GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision

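Summary

We study the challenging problem of 3D object segmentation in complex point clouds without requiring human labels of 3D scenes for supervision.
Existing unsupervised methods group 3D points into objects by relying on the similarity of pretrained 2D features or on external signals such as motion. As a result, they are usually limited to identifying simple objects such as cars, or their segmented objects are often inferior because the pretrained features lack objectness.
In this paper, we propose a new two-stage pipeline called GrabS.
The core idea is to learn generative and discriminative object-centric priors from object datasets in the first stage, and then, in the second stage, to design an embodied agent that learns to discover multiple objects by querying against the pretrained generative priors.
We extensively evaluate the method on two real-world datasets and a newly created synthetic dataset, demonstrating remarkable segmentation performance that clearly surpasses all existing unsupervised methods.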

Summary (original)

We study the hard problem of 3D object segmentation in complex point clouds without requiring human labels of 3D scenes for supervision. By relying on the similarity of pretrained 2D features or external signals such as motion to group 3D points as objects, existing unsupervised methods are usually limited to identifying simple objects like cars or their segmented objects are often inferior due to the lack of objectness in pretrained features. In this paper, we propose a new two-stage pipeline called GrabS. The core concept of our method is to learn generative and discriminative object-centric priors as a foundation from object datasets in the first stage, and then design an embodied agent to learn to discover multiple objects by querying against the pretrained generative priors in the second stage. We extensively evaluate our method on two real-world datasets and a newly created synthetic dataset, demonstrating remarkable segmentation performance, clearly surpassing all existing unsupervised methods.

arXiv information

Authors	Zihui Zhang,Yafei Yang,Hongtao Wen,Bo Yang
Published	2025-04-16 04:13:53+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
