EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition

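Summary

Visual place recognition is the task of predicting where an image (called the query) was taken, based solely on its visual features.
This is typically done through image retrieval: using learned global descriptors, the query is matched against the most similar images in a large database of geotagged photos.
A major challenge in this task is recognizing places seen from different viewpoints.
To overcome this limitation, we propose a new method, called EigenPlaces, for training a neural network on images taken from different viewpoints, which embeds viewpoint robustness into the learned global descriptors.
The underlying idea is to cluster the training data so that the model is explicitly presented with different views of the same points of interest.
These points of interest are selected without any extra supervision.
We then present experiments on the most comprehensive set of datasets in the literature. They show that EigenPlaces outperforms the previous state of the art on the majority of datasets, while requiring 60% less GPU memory for training and using descriptors that are 50% smaller.
The code and trained models for EigenPlaces are available at https://github.com/gmberton/EigenPlaces, while results with any other baseline can be computed with the codebase at https://github.com/gmberton/auto_VPR.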
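The retrieval step described in the summary can be illustrated with a minimal sketch. This is not the EigenPlaces code: the descriptors, coordinates, and the function names `l2_normalize` and `retrieve` are hypothetical, and cosine similarity is assumed as the matching score. In practice the descriptors would come from a trained network rather than being written by hand.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def retrieve(query_desc, database, top_k=1):
    """Return the geotags of the top_k database images most similar to the query.

    database is a list of (geotag, descriptor) pairs (hypothetical layout).
    """
    q = l2_normalize(query_desc)
    scored = []
    for geotag, desc in database:
        d = l2_normalize(desc)
        similarity = sum(a * b for a, b in zip(q, d))
        scored.append((similarity, geotag))
    # Highest cosine similarity first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [geotag for _, geotag in scored[:top_k]]


# Toy database: made-up (latitude, longitude) geotags with 3-D descriptors.
database = [
    ((45.0703, 7.6869), [0.9, 0.1, 0.0]),
    ((45.0627, 7.6785), [0.1, 0.9, 0.2]),
    ((45.0781, 7.6761), [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]

# The predicted place is the geotag of the best-matching database image.
predicted = retrieve(query, database, top_k=1)
print(predicted)
```

A brute-force scan like this is only meant to show the idea; a large database would typically be searched with an approximate nearest-neighbor index instead.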

Summary (original)

Visual Place Recognition is a task that aims to predict the place of an image (called query) based solely on its visual features. This is typically done through image retrieval, where the query is matched to the most similar images from a large database of geotagged photos, using learned global descriptors. A major challenge in this task is recognizing places seen from different viewpoints. To overcome this limitation, we propose a new method, called EigenPlaces, to train our neural network on images from different points of view, which embeds viewpoint robustness into the learned global descriptors. The underlying idea is to cluster the training data so as to explicitly present the model with different views of the same points of interest. The selection of these points of interest is done without the need for extra supervision. We then present experiments on the most comprehensive set of datasets in the literature, finding that EigenPlaces is able to outperform the previous state of the art on the majority of datasets, while requiring 60% less GPU memory for training and using 50% smaller descriptors. The code and trained models for EigenPlaces are available at https://github.com/gmberton/EigenPlaces, while results with any other baseline can be computed with the codebase at https://github.com/gmberton/auto_VPR.

arXiv information

Authors	Gabriele Berton, Gabriele Trivigno, Barbara Caputo, Carlo Masone
Published	2023-08-21 16:27:31+00:00

Source, services used

arxiv.jp, Google
