VDNA-PR: Using General Dataset Representations for Robust Sequential Visual Place Recognition

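Summary

This paper applies a general dataset representation technique to produce robust Visual Place Recognition (VPR) descriptors, which are crucial for real-world mobile robot localisation.
Two parallel lines of VPR work have shown, on one hand, that general-purpose off-the-shelf feature representations can provide robustness to domain shifts and, on the other, that fusing information from sequences of images improves performance.
In our recent work on measuring domain gaps between image datasets, we proposed the Visual Distribution of Neuron Activations (VDNA) representation for representing datasets of images.
This representation handles image sequences naturally and provides a general, granular feature representation derived from a general-purpose model.
Moreover, because it is built by tracking neuron activation values across the list of images being represented, and is not tied to one particular neural network layer, it has access to both high-level and low-level concepts.
This work shows how VDNAs can be used for VPR by learning a very lightweight and simple encoder that generates task-specific descriptors.
Our experiments show that the representation can be more robust than current solutions to severe domain shifts away from the training data distribution, such as indoor environments and aerial imagery.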

Summary (original)

This paper adapts a general dataset representation technique to produce robust Visual Place Recognition (VPR) descriptors, crucial to enable real-world mobile robot localisation. Two parallel lines of work on VPR have shown, on one side, that general-purpose off-the-shelf feature representations can provide robustness to domain shifts, and, on the other, that fused information from sequences of images improves performance. In our recent work on measuring domain gaps between image datasets, we proposed a Visual Distribution of Neuron Activations (VDNA) representation to represent datasets of images. This representation can naturally handle image sequences and provides a general and granular feature representation derived from a general-purpose model. Moreover, our representation is based on tracking neuron activation values over the list of images to represent and is not limited to a particular neural network layer, therefore having access to high- and low-level concepts. This work shows how VDNAs can be used for VPR by learning a very lightweight and simple encoder to generate task-specific descriptors. Our experiments show that our representation can allow for better robustness than current solutions to serious domain shifts away from the training data distribution, such as to indoor environments and aerial imagery.
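To make the idea of "tracking neuron activation values over the list of images" concrete, here is a minimal sketch that is not taken from the paper. It assumes the distribution for each neuron is stored as a normalised histogram, that activations have already been extracted and scaled to a fixed range, and that layers are identified by name. The function name `activation_histograms`, the layer names, the bin count, and the value range are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def activation_histograms(
    activations: Dict[str, Sequence[Sequence[float]]],
    num_bins: int = 8,
    value_range: Tuple[float, float] = (0.0, 1.0),
) -> Dict[str, List[List[float]]]:
    """Build one normalised histogram per neuron, for every layer.

    activations[layer][i][j] is the activation of neuron j in `layer`
    for image i of the sequence (hypothetical input layout).
    """
    lo, hi = value_range
    width = (hi - lo) / num_bins
    result: Dict[str, List[List[float]]] = {}
    for layer, per_image in activations.items():
        num_neurons = len(per_image[0])
        counts = [[0] * num_bins for _ in range(num_neurons)]
        for image_acts in per_image:
            for j, value in enumerate(image_acts):
                # Clip to the assumed range, then find the bin index.
                clipped = min(max(value, lo), hi)
                b = min(int((clipped - lo) / width), num_bins - 1)
                counts[j][b] += 1
        num_images = len(per_image)
        # Normalise so each neuron's histogram sums to 1.
        result[layer] = [[c / num_images for c in neuron] for neuron in counts]
    return result


# Toy sequence of three images with an "early" and a "late" layer.
sequence = {
    "early": [[0.10, 0.90], [0.20, 0.80], [0.15, 0.95]],
    "late": [[0.50], [0.55], [0.45]],
}
hist = activation_histograms(sequence, num_bins=4)
```

Because the histograms are accumulated over however many images are supplied, the same structure describes a single image or a whole sequence, and keeping every layer preserves both low-level and high-level responses. The lightweight encoder mentioned in the abstract would take such a structure as input; its architecture is not described here, so it is left out of the sketch.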

arXiv information

Authors	Benjamin Ramtoula, Daniele De Martini, Matthew Gadd, Paul Newman
Published	2024-03-14 01:30:28+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
