SM$^3$: Self-Supervised Multi-task Modeling with Multi-view 2D Images for Articulated Objects

Summary

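Reconstructing real-world objects and estimating their movable joint structures is a pivotal capability in robotics.
Previous research has focused mainly on supervised approaches, which rely on extensively annotated datasets to model articulated objects within a limited set of categories.
However, such approaches do not cope effectively with the diversity found in the real world.
To address this, we propose SM$^3$, a self-supervised interaction perception method. It uses multi-view RGB images captured before and after an interaction to model articulated objects, identify their movable parts, and infer the parameters of their rotating joints.
By constructing 3D geometry and texture from the captured 2D images, SM$^3$ jointly optimizes the movable parts and the joint parameters during reconstruction, which removes the need for annotations.
In addition, we introduce the MMArt dataset, an extension of PartNet-Mobility that contains multi-view and multi-modal data of articulated objects across diverse categories.
Evaluations show that SM$^3$ outperforms existing benchmarks across various categories and objects, and its adaptability to real-world scenarios has been thoroughly validated.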

Summary (original)

Reconstructing real-world objects and estimating their movable joint structures are pivotal technologies within the field of robotics. Previous research has predominantly focused on supervised approaches, relying on extensively annotated datasets to model articulated objects within limited categories. However, this approach falls short of effectively addressing the diversity present in the real world. To tackle this issue, we propose a self-supervised interaction perception method, referred to as SM$^3$, which leverages multi-view RGB images captured before and after interaction to model articulated objects, identify the movable parts, and infer the parameters of their rotating joints. By constructing 3D geometries and textures from the captured 2D images, SM$^3$ achieves integrated optimization of movable part and joint parameters during the reconstruction process, obviating the need for annotations. Furthermore, we introduce the MMArt dataset, an extension of PartNet-Mobility, encompassing multi-view and multi-modal data of articulated objects spanning diverse categories. Evaluations demonstrate that SM$^3$ surpasses existing benchmarks across various categories and objects, while its adaptability in real-world scenarios has been thoroughly validated.
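
As a reading aid for "the parameters of their rotating joints": a rotating (revolute) joint is commonly described by an axis direction, a pivot point on that axis, and a rotation angle. The sketch below is not SM$^3$'s implementation, and the abstract does not state which parameterization the method uses. It only illustrates, under that assumed parameterization, how such parameters move a point on a movable part, using Rodrigues' rotation formula. The function name `rotate_about_axis` and its arguments are hypothetical.

```python
import math


def rotate_about_axis(point, pivot, axis, angle_rad):
    """Rotate a 3D point around the line through `pivot` with direction `axis`.

    Assumed (hypothetical) revolute-joint parameterization:
    axis direction, pivot point, and rotation angle in radians.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("axis must be a non-zero vector")
    k = [a / norm for a in axis]                   # unit axis direction
    v = [p - c for p, c in zip(point, pivot)]      # point relative to the pivot
    cos_t, sin_t = math.cos(angle_rad), math.sin(angle_rad)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    # Rodrigues' formula: v*cos(t) + (k x v)*sin(t) + k*(k . v)*(1 - cos(t))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [r + c for r, c in zip(rotated, pivot)]  # back to world coordinates


# Example: a point on a door edge swung a quarter turn about a vertical hinge.
hinge_pivot = [1.0, 1.0, 0.0]
hinge_axis = [0.0, 0.0, 1.0]
moved = rotate_about_axis([2.0, 1.0, 0.0], hinge_pivot, hinge_axis, math.pi / 2)
```

In a before/after setting like the one the abstract describes, points on the static part would stay fixed while points on the movable part would follow a transform of this kind.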

arXiv information

Authors	Haowen Wang, Zhen Zhao, Zhao Jin, Zhengping Che, Liang Qiao, Yakun Huang, Zhipeng Fan, Xiuquan Qiao, Jian Tang
Published	2024-01-17 11:15:09+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
