UniLoc: Towards Universal Place Recognition Using Any Single Modality

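Abstract

To date, most place recognition methods have focused on single-modality retrieval.
Single-modality methods perform well in specific environments, but cross-modal methods offer greater flexibility because they allow seamless switching between map and query sources.
A unified model also promises lower computational requirements, and sharing parameters promises greater sample efficiency.
This work develops UniLoc, a universal solution for place recognition that works with any single query modality (natural language, image, or point cloud).
UniLoc leverages recent advances in large-scale contrastive learning and learns by matching hierarchically at two levels: instance-level matching and scene-level matching.
Specifically, the authors propose a novel Self-Attention based Pooling (SAP) module that evaluates the importance of each instance descriptor when the descriptors are aggregated into a place-level descriptor.
Experiments on the KITTI-360 dataset demonstrate the benefits of cross-modality for place recognition: UniLoc achieves superior performance in cross-modal settings and competitive results in uni-modal scenarios.
The project page is publicly available at https://yan-xia.github.io/projects/UniLoc/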

Abstract (original)

To date, most place recognition methods focus on single-modality retrieval. While they perform well in specific environments, cross-modal methods offer greater flexibility by allowing seamless switching between map and query sources. It also promises to reduce computation requirements by having a unified model, and achieving greater sample efficiency by sharing parameters. In this work, we develop a universal solution to place recognition, UniLoc, that works with any single query modality (natural language, image, or point cloud). UniLoc leverages recent advances in large-scale contrastive learning, and learns by matching hierarchically at two levels: instance-level matching and scene-level matching. Specifically, we propose a novel Self-Attention based Pooling (SAP) module to evaluate the importance of instance descriptors when aggregated into a place-level descriptor. Experiments on the KITTI-360 dataset demonstrate the benefits of cross-modality for place recognition, achieving superior performance in cross-modal settings and competitive results also for uni-modal scenarios. Our project page is publicly available at https://yan-xia.github.io/projects/UniLoc/.
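
The abstract describes aggregating instance descriptors into one place-level descriptor, with attention used to weigh each instance. The sketch below illustrates that general idea with a generic attention-weighted pooling in NumPy. It is not the paper's SAP module: the function `attention_pool`, the projection matrices `w_q`, `w_k`, and `w_score`, the descriptor sizes, and the random (untrained) weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(instances, w_q, w_k, w_score):
    """Pool (n, d) instance descriptors into one unit-length (d,) descriptor.

    Hypothetical helper: one self-attention step mixes context into each
    instance, a linear scorer turns the result into per-instance weights,
    and the weights combine the original descriptors.
    """
    q = instances @ w_q                                   # (n, d) queries
    k = instances @ w_k                                   # (n, d) keys
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)  # (n, n)
    context = attn @ instances                            # (n, d)
    weights = softmax(context @ w_score, axis=0)          # (n,) importance
    place = weights @ instances                           # (d,) weighted sum
    return place / np.linalg.norm(place), weights

# Toy input: 5 instance descriptors of dimension 8 (assumed sizes).
rng = np.random.default_rng(0)
instances = rng.normal(size=(5, 8))
w_q = rng.normal(size=(8, 8))       # random stand-ins for learned weights
w_k = rng.normal(size=(8, 8))
w_score = rng.normal(size=8)

place, weights = attention_pool(instances, w_q, w_k, w_score)
```

In a trained system the projections would be learned, for example with a contrastive objective, so that the weights reflect how informative each instance is. With descriptors normalized this way, a query could be matched against a map by cosine similarity, which is a common retrieval choice and again an assumption here.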

arXiv information

Authors	Yan Xia,Zhendong Li,Yun-Jin Li,Letian Shi,Hu Cao,João F. Henriques,Daniel Cremers
Published	2024-12-16 18:48:58+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
