Mask3D: Mask Transformer for 3D Semantic Instance Segmentation

Summary

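Title: Mask Transformer for 3D Semantic Instance Segmentation (Mask3D)

Summary:
– Modern 3D semantic instance segmentation methods rely predominantly on specialized voting mechanisms followed by carefully designed geometric clustering techniques.
– Building on the success of recent Transformer-based methods for object detection and image segmentation, the authors propose the first Transformer-based approach for 3D semantic instance segmentation.
– In the model, Mask3D, each object instance is represented as an instance query. Using Transformer decoders, the instance queries are learned by iteratively attending to point cloud features at multiple scales. Combined with point features, the instance queries directly yield all instance masks in parallel.
– Mask3D has several advantages over current state-of-the-art approaches: (1) it does not rely on voting schemes that require hand-selected geometric properties such as centers, (2) it does not rely on geometric grouping mechanisms that require manually tuned hyperparameters such as radii, and (3) it enables a loss that directly optimizes instance masks.
– Mask3D sets a new state of the art on ScanNet test (+6.2 mAP), S3DIS 6-fold (+10.1 mAP), STPLS3D (+11.2 mAP), and ScanNet200 test (+12.4 mAP).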
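
To make the third bullet more concrete, here is a minimal sketch of how a set of instance queries combined with per-point features could yield all instance masks at once. The summary does not spell out the exact operation, so the dot product followed by a sigmoid and a 0.5 threshold is an assumption made for illustration, and the names `predict_masks`, `queries`, and `point_features` are hypothetical.

```python
import math


def sigmoid(x):
    """Map a raw similarity score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_masks(queries, point_features, threshold=0.5):
    """Return one binary mask per instance query.

    masks[k][i] is True when point i is assigned to the instance
    represented by query k. The dot-product similarity is an assumed
    way of "combining" queries with point features.
    """
    masks = []
    for query in queries:
        row = []
        for feature in point_features:
            score = sum(q * f for q, f in zip(query, feature))
            row.append(sigmoid(score) > threshold)
        masks.append(row)
    return masks


# Toy data: 2 instance queries, 4 points, feature dimension 2.
queries = [[4.0, -4.0], [-4.0, 4.0]]
point_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]]

masks = predict_masks(queries, point_features)
for k, mask in enumerate(masks):
    print(f"instance {k}: {mask}")
```

Each query is scored against every point independently, so no voting step, center prediction, or clustering radius appears anywhere in the sketch. In a real model the loops would be a single matrix product over learned features.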

Summary (original)

Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques. Building on the successes of recent Transformer-based methods for object detection and image segmentation, we propose the first Transformer-based approach for 3D semantic instance segmentation. We show that we can leverage generic Transformer building blocks to directly predict instance masks from 3D point clouds. In our model called Mask3D each object instance is represented as an instance query. Using Transformer decoders, the instance queries are learned by iteratively attending to point cloud features at multiple scales. Combined with point features, the instance queries directly yield all instance masks in parallel. Mask3D has several advantages over current state-of-the-art approaches, since it neither relies on (1) voting schemes which require hand-selected geometric properties (such as centers) nor (2) geometric grouping mechanisms requiring manually-tuned hyper-parameters (e.g. radii) and (3) enables a loss that directly optimizes instance masks. Mask3D sets a new state-of-the-art on ScanNet test (+6.2 mAP), S3DIS 6-fold (+10.1 mAP), STPLS3D (+11.2 mAP) and ScanNet200 test (+12.4 mAP).
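
The abstract says the approach "enables a loss that directly optimizes instance masks" but does not name that loss here. As one hedged illustration of what such a loss can look like, the sketch below uses a soft Dice loss, which is a common choice for mask prediction and is an assumption on our part; `dice_loss` and its `eps` smoothing term are hypothetical names.

```python
def dice_loss(pred_probs, target_mask, eps=1.0):
    """Soft Dice loss between predicted per-point probabilities and a 0/1 mask.

    The loss approaches 0 when the prediction overlaps the target
    perfectly and grows as the overlap shrinks. `eps` avoids division
    by zero for empty masks.
    """
    intersection = sum(p * t for p, t in zip(pred_probs, target_mask))
    total = sum(pred_probs) + sum(target_mask)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


# Ground-truth mask over 4 points: the instance covers the first two.
target = [1, 1, 0, 0]

good = dice_loss([0.9, 0.8, 0.1, 0.2], target)  # mostly correct prediction
bad = dice_loss([0.1, 0.2, 0.9, 0.8], target)   # mostly wrong prediction
print(f"good prediction loss: {good:.2f}")
print(f"bad prediction loss:  {bad:.2f}")
```

Because the loss is computed on the mask itself, it rewards the quantity that is evaluated rather than a proxy such as a predicted object center.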

arXiv information

Authors	Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe
Published	2023-04-12 09:22:53+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
