Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding

Summary

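Zero-shot 3D point cloud understanding can be achieved through 2D Vision-Language Models (VLMs).
Existing strategies map vision-language models directly from the 2D pixels of rendered or captured views to 3D points, ignoring the inherent and expressible geometric structure of the point cloud.
Geometrically similar or nearby regions are likely to share semantic information, so they can be exploited to strengthen point cloud understanding.
To this end, we introduce the first training-free aggregation technique that leverages the 3D geometric structure of the point cloud to improve the quality of the transferred vision-language models.
Our approach operates iteratively, performing local-to-global aggregation based on geometric and semantic point-level reasoning.
We benchmark our approach on three downstream tasks (classification, part segmentation, and semantic segmentation) using a variety of datasets that cover both synthetic and real-world scenarios, indoors and outdoors.
Our approach achieves new state-of-the-art results on all benchmarks.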
Code and dataset are available at https://luigiriz.github.io/geoze-website/
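
The summary does not say how the aggregation is computed. Purely as an illustration of the general idea (nearby points with similar features pull each other's features together), the sketch below averages per-point features over geometric neighbours and repeats this for a few iterations. It is not the authors' algorithm: the function name `aggregate_features`, the parameters `k`, `sigma`, and `iters`, the Gaussian distance weight, and the cosine-similarity weight are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def aggregate_features(points, feats, k=3, sigma=0.5, iters=2):
    """Hypothetical geometry- and semantics-weighted neighbour averaging.

    points: list of (x, y, z) tuples.
    feats:  list of per-point feature vectors (e.g. transferred from a VLM).
    Returns a new list of feature vectors; the inputs are not modified.
    """
    n = len(points)
    # Brute-force k nearest neighbours; each point is its own first neighbour.
    neighbours = []
    for i in range(n):
        order = sorted(range(n), key=lambda j: math.dist(points[i], points[j]))
        neighbours.append(order[:k])

    for _ in range(iters):
        new_feats = []
        for i in range(n):
            acc = [0.0] * len(feats[i])
            total = 0.0
            for j in neighbours[i]:
                d = math.dist(points[i], points[j])
                w_geo = math.exp(-(d * d) / (2.0 * sigma * sigma))  # closer -> larger
                w_sem = max(cosine(feats[i], feats[j]), 0.0)        # similar -> larger
                w = w_geo * w_sem
                total += w
                for c, value in enumerate(feats[j]):
                    acc[c] += w * value
            # Keep the old feature if every weight vanished.
            new_feats.append([a / total for a in acc] if total > 0.0 else list(feats[i]))
        feats = new_feats
    return feats


# Two well-separated clusters; point 2 carries a noisy feature.
points = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (5, 5, 5), (5.1, 5, 5), (5, 5.1, 5)]
feats = [[1.0, 0.0], [1.0, 0.0], [0.6, 0.4], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
smoothed = aggregate_features(points, feats)
```

In this toy input the noisy feature of point 2 is drawn toward the features of its two geometric neighbours, while the second cluster, whose features already agree, is left unchanged. The brute-force neighbour search is quadratic in the number of points, so a real point cloud would need a spatial index instead.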

Abstract (original)

Zero-shot 3D point cloud understanding can be achieved via 2D Vision-Language Models (VLMs). Existing strategies directly map Vision-Language Models from 2D pixels of rendered or captured views to 3D points, overlooking the inherent and expressible point cloud geometric structure. Geometrically similar or close regions can be exploited for bolstering point cloud understanding as they are likely to share semantic information. To this end, we introduce the first training-free aggregation technique that leverages the point cloud’s 3D geometric structure to improve the quality of the transferred Vision-Language Models. Our approach operates iteratively, performing local-to-global aggregation based on geometric and semantic point-level reasoning. We benchmark our approach on three downstream tasks, including classification, part segmentation, and semantic segmentation, with a variety of datasets representing both synthetic/real-world, and indoor/outdoor scenarios. Our approach achieves new state-of-the-art results in all benchmarks. Code and dataset are available at https://luigiriz.github.io/geoze-website/

arXiv information

Authors	Guofeng Mei, Luigi Riz, Yiming Wang, Fabio Poiesi
Published	2024-04-15 10:06:19+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
