FaVoR: Features via Voxel Rendering for Camera Relocalization

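Abstract

Camera relocalization methods range from dense image alignment to direct regression of the camera pose from a query image.
Among these, sparse feature matching stands out as an efficient, versatile, and generally lightweight approach with many applications.
However, feature-based methods often struggle with significant viewpoint and appearance changes, which lead to matching failures and inaccurate pose estimates.
To overcome this limitation, we propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features.
By tracking and triangulating landmarks over a sequence of frames, we build a sparse voxel map optimized to render the image patch descriptors observed during tracking.
Given an initial pose estimate, we first synthesize descriptors from the voxels using volumetric rendering and then perform feature matching to estimate the camera pose.
This methodology makes it possible to generate descriptors for unseen views, which improves robustness to view changes.
We extensively evaluate our method on the 7-Scenes and Cambridge Landmarks datasets.
Our results show that our method significantly outperforms existing state-of-the-art feature representation techniques in indoor environments, with up to a 39% improvement in median translation error.
In addition, our approach yields results comparable to other methods in outdoor scenarios while maintaining lower memory and computational costs.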

Abstract (original)

Camera relocalization methods range from dense image alignment to direct camera pose regression from a query image. Among these, sparse feature matching stands out as an efficient, versatile, and generally lightweight approach with numerous applications. However, feature-based methods often struggle with significant viewpoint and appearance changes, leading to matching failures and inaccurate pose estimates. To overcome this limitation, we propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features. By tracking and triangulating landmarks over a sequence of frames, we construct a sparse voxel map optimized to render image patch descriptors observed during tracking. Given an initial pose estimate, we first synthesize descriptors from the voxels using volumetric rendering and then perform feature matching to estimate the camera pose. This methodology enables the generation of descriptors for unseen views, enhancing robustness to view changes. We extensively evaluate our method on the 7-Scenes and Cambridge Landmarks datasets. Our results show that our method significantly outperforms existing state-of-the-art feature representation techniques in indoor environments, achieving up to a 39% improvement in median translation error. Additionally, our approach yields comparable results to other methods for outdoor scenarios while maintaining lower memory and computational costs.
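The abstract does not spell out how descriptors are synthesized from voxels, so the following is only a minimal sketch of the general idea, assuming standard NeRF-style alpha compositing along a single ray. The function names (`render_descriptor`, `match_nearest`), the uniform step size, and the cosine-similarity matching are illustrative assumptions, not the authors' implementation.

```python
import math


def render_descriptor(densities, descriptors, step):
    """Alpha-composite per-sample descriptors along one ray.

    Assumption: each sample inside a voxel carries a density and a
    descriptor vector, and samples are spaced `step` apart along the ray.
    """
    dim = len(descriptors[0])
    rendered = [0.0] * dim
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for sigma, feat in zip(densities, descriptors):
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this sample
        weight = transmittance * alpha
        for k in range(dim):
            rendered[k] += weight * feat[k]
        transmittance *= 1.0 - alpha
    # L2-normalise so descriptors can be compared by dot product.
    norm = math.sqrt(sum(v * v for v in rendered))
    if norm > 0.0:
        rendered = [v / norm for v in rendered]
    return rendered


def match_nearest(rendered, candidates):
    """Return the index of the unit-norm candidate most similar to `rendered`."""
    best_index, best_score = -1, -math.inf
    for i, cand in enumerate(candidates):
        score = sum(a * b for a, b in zip(rendered, cand))  # cosine similarity
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

In a full pipeline, the matched 2D-3D correspondences would then be passed to a pose solver (for example PnP inside RANSAC) to refine the initial pose estimate; that step is omitted here.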

arXiv information

Authors	Vincenzo Polizzi, Marco Cannici, Davide Scaramuzza, Jonathan Kelly
Published	2025-05-21 02:39:25+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
