econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians

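Abstract

Recent work on open-vocabulary neural fields focuses mainly on extracting precise semantic features from VLMs and consolidating them efficiently into a multi-view consistent 3D neural field representation.
However, most existing methods over-trust SAM to regularize image-level CLIP features, without any further refinement.
In addition, several existing methods improve efficiency by reducing the dimensionality of the semantic features from 2D VLMs before fusing them into the 3DGS semantic field, which inevitably leads to multi-view inconsistency.
In this work, we propose econSG for open-vocabulary semantic segmentation with 3DGS.
econSG consists of: 1) Confidence-region Guided Regularization (CRR), which mutually refines SAM and CLIP to get the best of both worlds: precise semantic features with complete and precise boundaries.
2) A low-dimensional contextual space that enforces 3D multi-view consistency while improving computational efficiency: back-projected multi-view 2D features are fused, and dimensionality reduction is then applied directly to the fused 3D features instead of to each 2D view separately.
econSG achieves state-of-the-art performance on four benchmark datasets compared with existing methods.
Furthermore, econSG is also the most efficient to train among all of these methods.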
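
The second component (fuse back-projected multi-view features first, reduce dimensionality afterwards) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how features are fused or which reduction is used, so the sketch assumes simple averaging per 3D point and uses PCA as a stand-in for the reduction step. The function names `fuse_multiview_features`, `fit_pca` and `project` are hypothetical.

```python
import numpy as np


def fuse_multiview_features(point_ids, features, num_points):
    """Average all 2D features that back-project onto the same 3D point.

    point_ids: (M,) index of the 3D point hit by each pixel sample,
               gathered over all views (assumed to be given).
    features:  (M, D) high-dimensional 2D features for those samples.
    Returns a (num_points, D) array; points never observed stay at zero.
    """
    dim = features.shape[1]
    sums = np.zeros((num_points, dim))
    counts = np.zeros(num_points)
    # np.add.at accumulates correctly when an index appears more than once.
    np.add.at(sums, point_ids, features)
    np.add.at(counts, point_ids, 1.0)
    counts = np.maximum(counts, 1.0)  # avoid division by zero
    return sums / counts[:, None]


def fit_pca(fused, out_dim):
    """Fit one shared linear basis on the fused 3D features (PCA stand-in)."""
    mean = fused.mean(axis=0)
    _, _, vt = np.linalg.svd(fused - mean, full_matrices=False)
    return mean, vt[:out_dim]


def project(x, mean, basis):
    """Map D-dimensional features into the low-dimensional space."""
    return (x - mean) @ basis.T


def demo():
    """Toy run: 3 views, 50 points, 32-D features reduced to 4-D."""
    rng = np.random.default_rng(0)
    num_points, dim, out_dim = 50, 32, 4
    point_ids = np.tile(np.arange(num_points), 3)  # each view sees every point
    features = rng.normal(size=(point_ids.size, dim))
    fused = fuse_multiview_features(point_ids, features, num_points)
    mean, basis = fit_pca(fused, out_dim)
    return project(fused, mean, basis)
```

In this sketch every 3D point ends up with exactly one low-dimensional code, computed from a single basis fitted on the fused features. Reducing each 2D view separately gives no such guarantee, which is the source of multi-view inconsistency that the abstract points to.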

Abstract (original)

The primary focus of most recent works on open-vocabulary neural fields is extracting precise semantic features from the VLMs and then consolidating them efficiently into a multi-view consistent 3D neural fields representation. However, most existing works over-trusted SAM to regularize image-level CLIP without any further refinement. Moreover, several existing works improved efficiency by dimensionality reduction of semantic features from 2D VLMs before fusing with 3DGS semantic fields, which inevitably leads to multi-view inconsistency. In this work, we propose econSG for open-vocabulary semantic segmentation with 3DGS. Our econSG consists of: 1) A Confidence-region Guided Regularization (CRR) that mutually refines SAM and CLIP to get the best of both worlds for precise semantic features with complete and precise boundaries. 2) A low dimensional contextual space to enforce 3D multi-view consistency while improving computational efficiency by fusing backprojected multi-view 2D features and follow by dimensional reduction directly on the fused 3D features instead of operating on each 2D view separately. Our econSG shows state-of-the-art performance on four benchmark datasets compared to the existing methods. Furthermore, we are also the most efficient training among all the methods.

arXiv information

Authors	Can Zhang, Gim Hee Lee
Published	2025-04-08 13:12:31+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
