Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

Abstract

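We introduce Open3DIS, a new method for open-vocabulary instance segmentation in 3D scenes.
Objects in 3D environments vary widely in shape, scale, and color, which makes accurate instance-level identification difficult.
Recent work on open-vocabulary scene understanding has made significant progress by using class-agnostic 3D instance proposal networks to localize objects and by learning queryable features for each 3D mask.
These methods produce high-quality instance proposals but struggle to identify small and geometrically ambiguous objects.
The key idea of our method is a new module that aggregates 2D instance masks across frames and maps them to geometrically coherent point cloud regions, yielding high-quality object proposals that address these limitations.
These proposals are then combined with class-agnostic 3D instance proposals to cover a wide range of real-world objects.
To validate the approach, we ran experiments on three prominent datasets (ScanNet200, S3DIS, and Replica) and demonstrated significant performance gains over state-of-the-art approaches in segmenting objects from diverse categories.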

Abstract (original)

We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes. Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task. Recent advancements in Open-Vocabulary scene understanding have made significant strides in this area by employing class-agnostic 3D instance proposal networks for object localization and learning queryable features for each 3D mask. While these methods produce high-quality instance proposals, they struggle with identifying small-scale and geometrically ambiguous objects. The key idea of our method is a new module that aggregates 2D instance masks across frames and maps them to geometrically coherent point cloud regions as high-quality object proposals addressing the above limitations. These are then combined with 3D class-agnostic instance proposals to include a wide range of objects in the real world. To validate our approach, we conducted experiments on three prominent datasets, including ScanNet200, S3DIS, and Replica, demonstrating significant performance gains in segmenting objects with diverse categories over the state-of-the-art approaches.
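
The abstract describes lifting 2D instance masks to point cloud regions, aggregating them across frames, and combining the result with class-agnostic 3D proposals. The following is a minimal sketch of that general idea, not the authors' module. It assumes a pinhole camera with known intrinsics `K` and world-to-camera poses, ignores occlusion, and swaps in a simple greedy IoU merge as a stand-in for the aggregation step, which the abstract does not specify. All function names, the `frames` layout, and the `thresh` parameter are hypothetical.

```python
import numpy as np


def lift_mask_to_points(points, K, world_to_cam, mask):
    """Mark the 3D points whose projection lands inside one 2D instance mask.

    points: (N, 3) world coordinates; K: (3, 3) intrinsics (assumed pinhole);
    world_to_cam: (4, 4) pose; mask: (H, W) boolean array.
    Occlusion is ignored; a real pipeline would also check against depth.
    """
    homo = np.hstack([points, np.ones((len(points), 1))])
    cam = (world_to_cam @ homo.T).T[:, :3]
    in_front = cam[:, 2] > 1e-6
    z = np.where(in_front, cam[:, 2], 1.0)  # placeholder depth for points behind the camera
    uv = (K @ cam.T).T
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    h, w = mask.shape
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    hit = np.zeros(len(points), dtype=bool)
    hit[valid] = mask[v[valid], u[valid]]
    return hit


def iou(a, b):
    """Intersection over union of two boolean point masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def merge_proposals(proposals, thresh=0.5):
    """Greedily merge point masks whose IoU exceeds thresh (assumed rule)."""
    merged = []
    for p in proposals:
        if not p.any():
            continue  # the mask did not cover any point
        for i, m in enumerate(merged):
            if iou(p, m) > thresh:
                merged[i] = m | p
                break
        else:
            merged.append(p.copy())
    return merged


def build_proposals(points, frames, proposals_3d, thresh=0.5):
    """frames: iterable of (K, world_to_cam, [2D masks]) tuples (hypothetical layout)."""
    lifted = [lift_mask_to_points(points, K, T, m)
              for K, T, masks in frames for m in masks]
    from_2d = merge_proposals(lifted, thresh)
    # Combine with the class-agnostic 3D proposals, dropping near-duplicates.
    return merge_proposals(list(proposals_3d) + from_2d, thresh)
```

In this sketch, a small object that the 3D proposal network misses can still enter the final set if a 2D segmenter finds it in at least one frame, which is the motivation the abstract gives for the 2D-guided branch.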

arXiv information

Authors	Phuc D. A. Nguyen, Tuan Duc Ngo, Chuang Gan, Evangelos Kalogerakis, Anh Tran, Cuong Pham, Khoi Nguyen
Published	2024-03-31 18:37:10+00:00
arXiv page	arxiv_id (pdf)

Source, services used

arxiv.jp, Google
