SAMPart3D: Segment Any Part in 3D Objects

Summary

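3D part segmentation is an important and difficult task in 3D perception, and it plays a key role in applications such as robotics, 3D generation, and 3D editing.
Recent methods use powerful vision-language models (VLMs) to distill knowledge from 2D into 3D, achieving zero-shot 3D part segmentation.
However, these methods depend on text prompts, which limits both their scalability to large unlabeled datasets and their flexibility in handling part ambiguity.
This work introduces SAMPart3D, a scalable zero-shot 3D part segmentation framework that segments any 3D object into semantic parts at multiple granularities without requiring a predefined set of part labels as text prompts.
For scalability, a 3D feature extraction backbone is distilled from text-agnostic vision foundation models, so that training can scale to large unlabeled 3D datasets and learn rich 3D priors.
For flexibility, scale-conditioned, part-aware 3D features are distilled to support 3D part segmentation at multiple granularities.
Once the parts have been segmented from these scale-conditioned, part-aware 3D features, VLMs assign a semantic label to each part based on multi-view renderings.
Compared with previous methods, SAMPart3D can scale to the recent large-scale 3D object dataset Objaverse and handle complex, non-ordinary objects.
The authors also contribute a new 3D part segmentation benchmark to address the lack of diversity and complexity of objects and parts in existing benchmarks.
Experiments show that SAMPart3D significantly outperforms existing zero-shot 3D part segmentation methods and can facilitate applications such as part-level editing and interactive segmentation.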
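
The 2D-to-3D distillation idea mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the pinhole camera model, the cosine-distance loss, and every name here (`project`, `distillation_loss`, `feat_map`) are assumptions chosen for illustration, since the abstract does not specify them. The sketch projects each 3D point into one rendered view, looks up the 2D feature at that pixel, and measures how far the point's 3D feature is from it.

```python
import math

def project(point, focal, cx, cy):
    """Pinhole projection of a camera-space point (x, y, z) to an integer pixel (u, v)."""
    x, y, z = point
    return round(focal * x / z + cx), round(focal * y / z + cy)

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

def distillation_loss(points, point_feats, feat_map, focal, cx, cy):
    """Mean (1 - cosine similarity) between each in-frame point's 3D feature
    and the 2D feature at the pixel it projects to. Occlusion is ignored."""
    height, width = len(feat_map), len(feat_map[0])
    losses = []
    for p, f3d in zip(points, point_feats):
        u, v = project(p, focal, cx, cy)
        if 0 <= u < width and 0 <= v < height:  # skip points outside the image
            losses.append(1.0 - cosine(f3d, feat_map[v][u]))
    return sum(losses) / len(losses) if losses else 0.0

# Toy 2x2 "teacher" feature map (hypothetical values), indexed as feat_map[v][u].
feat_map = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 1.0], [1.0, -1.0]]]
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (5.0, 5.0, 1.0)]  # last one is out of frame
student_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

loss = distillation_loss(points, student_feats, feat_map, focal=1.0, cx=0.0, cy=0.0)
print(loss)  # 0.5: first point matches its pixel, second is orthogonal to it
```

In a real pipeline the per-point features would come from a trainable 3D backbone and the loss would be minimized over many views and objects; this toy version only shows the correspondence between 3D points and 2D features.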

Abstract (original)

3D part segmentation is a crucial and challenging task in 3D perception, playing a vital role in applications such as robotics, 3D generation, and 3D editing. Recent methods harness the powerful Vision Language Models (VLMs) for 2D-to-3D knowledge distillation, achieving zero-shot 3D part segmentation. However, these methods are limited by their reliance on text prompts, which restricts the scalability to large-scale unlabeled datasets and the flexibility in handling part ambiguities. In this work, we introduce SAMPart3D, a scalable zero-shot 3D part segmentation framework that segments any 3D object into semantic parts at multiple granularities, without requiring predefined part label sets as text prompts. For scalability, we use text-agnostic vision foundation models to distill a 3D feature extraction backbone, allowing scaling to large unlabeled 3D datasets to learn rich 3D priors. For flexibility, we distill scale-conditioned part-aware 3D features for 3D part segmentation at multiple granularities. Once the segmented parts are obtained from the scale-conditioned part-aware 3D features, we use VLMs to assign semantic labels to each part based on the multi-view renderings. Compared to previous methods, our SAMPart3D can scale to the recent large-scale 3D object dataset Objaverse and handle complex, non-ordinary objects. Additionally, we contribute a new 3D part segmentation benchmark to address the lack of diversity and complexity of objects and parts in existing benchmarks. Experiments show that our SAMPart3D significantly outperforms existing zero-shot 3D part segmentation methods, and can facilitate various applications such as part-level editing and interactive segmentation.
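
To make "scale-conditioned" features and "multiple granularities" more concrete, here is a toy sketch. It is only an illustration under stated assumptions: the features are hand-built from hypothetical part IDs rather than learned, and `scale_conditioned_features` and `greedy_cluster` are made-up helpers, not the paper's components. The point is that the same clustering step yields coarser or finer parts depending on a scale value fed into the features.

```python
import math

def scale_conditioned_features(coarse_ids, fine_ids, scale):
    """Hypothetical stand-in for a learned scale-conditioned feature head.

    scale = 1.0 suppresses fine-level detail; scale = 0.0 keeps it fully.
    """
    n_coarse = max(coarse_ids) + 1
    n_fine = max(fine_ids) + 1
    feats = []
    for c, f in zip(coarse_ids, fine_ids):
        coarse = [1.0 if i == c else 0.0 for i in range(n_coarse)]
        fine = [(1.0 - scale) if i == f else 0.0 for i in range(n_fine)]
        feats.append(coarse + fine)  # concatenate the two one-hot blocks
    return feats

def greedy_cluster(features, threshold):
    """Assign each point to the first cluster centre closer than `threshold`,
    opening a new cluster when none is close enough."""
    centres, labels = [], []
    for feat in features:
        for idx, centre in enumerate(centres):
            if math.dist(feat, centre) < threshold:
                labels.append(idx)
                break
        else:
            centres.append(feat)
            labels.append(len(centres) - 1)
    return labels

# Eight points on a toy object: two coarse parts, each split into two finer parts.
coarse_ids = [0, 0, 0, 0, 1, 1, 1, 1]
fine_ids = [0, 0, 1, 1, 2, 2, 3, 3]

coarse_parts = greedy_cluster(scale_conditioned_features(coarse_ids, fine_ids, 1.0), 0.5)
fine_parts = greedy_cluster(scale_conditioned_features(coarse_ids, fine_ids, 0.0), 0.5)
print(coarse_parts)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(fine_parts)    # [0, 0, 1, 1, 2, 2, 3, 3]
```

The greedy clustering is a deliberate simplification; any clustering method could be swapped in, and the abstract does not say which one the framework uses.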

arXiv information

Authors	Yunhan Yang, Yukun Huang, Yuan-Chen Guo, Liangjun Lu, Xiaoyang Wu, Edmund Y. Lam, Yan-Pei Cao, Xihui Liu
Published	2024-11-11 17:59:10+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
