LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes

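Summary

We address the task of uplifting visual features or semantic masks from 2D vision models to 3D scenes represented by Gaussian Splatting.
While common approaches rely on iterative optimization-based procedures, we show that a simple yet effective aggregation technique yields excellent results.
Applied to semantic masks from Segment Anything (SAM), our uplifting approach achieves segmentation quality comparable to the state of the art.
We then extend this method to generic DINOv2 features, integrating 3D scene geometry through graph diffusion, and achieve competitive segmentation results even though DINOv2, unlike SAM, is not trained on millions of annotated masks.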

Summary (original)

We address the task of uplifting visual features or semantic masks from 2D vision models to 3D scenes represented by Gaussian Splatting. Whereas common approaches rely on iterative optimization-based procedures, we show that a simple yet effective aggregation technique yields excellent results. Applied to semantic masks from Segment Anything (SAM), our uplifting approach leads to segmentation quality comparable to the state of the art. We then extend this method to generic DINOv2 features, integrating 3D scene geometry through graph diffusion, and achieve competitive segmentation results despite DINOv2 not being trained on millions of annotated masks like SAM.
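
To make the contrast with iterative optimization concrete, the following is a minimal sketch of what a learning-free aggregation could look like: each 3D Gaussian receives the weighted average of the 2D features of the pixels it contributes to. The abstract does not spell out the aggregation rule, so the weighting scheme, the `Contribution` input format, and the function name `uplift_features` are assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical input format (an assumption, not taken from the paper):
# one (gaussian_id, weight, feature) triple per pixel-Gaussian pair, where
# `weight` stands for how strongly that Gaussian contributed to the pixel
# when the view was rendered.
Contribution = Tuple[int, float, List[float]]


def uplift_features(contributions: List[Contribution], dim: int) -> Dict[int, List[float]]:
    """Average 2D features per Gaussian, weighted by contribution.

    This is a single accumulation pass: no loss, no gradients, no iterations.
    """
    sums: Dict[int, List[float]] = {}
    totals: Dict[int, float] = {}
    for gaussian_id, weight, feature in contributions:
        if len(feature) != dim:
            raise ValueError("feature length does not match dim")
        acc = sums.setdefault(gaussian_id, [0.0] * dim)
        for k in range(dim):
            acc[k] += weight * feature[k]
        totals[gaussian_id] = totals.get(gaussian_id, 0.0) + weight
    # Normalize by the total weight; Gaussians with zero total weight are skipped.
    return {
        gaussian_id: [value / totals[gaussian_id] for value in acc]
        for gaussian_id, acc in sums.items()
        if totals[gaussian_id] > 0.0
    }


if __name__ == "__main__":
    demo = [
        (0, 0.75, [1.0, 0.0]),
        (0, 0.25, [0.0, 1.0]),
        (1, 0.5, [2.0, 2.0]),
    ]
    print(uplift_features(demo, dim=2))
```

In this sketch, Gaussian 0 is seen by two pixels and ends up with a blend of their features, while Gaussian 1 simply inherits the feature of its single pixel. The same accumulation would apply to mask values (one scalar per pixel) as to feature vectors.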

arXiv information

Authors	Juliette Marrie, Romain Ménégaux, Michael Arbel, Diane Larlus, Julien Mairal
Published	2024-10-18 13:44:29+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
