Self-Supervised Visual Place Recognition by Mining Temporal and Feature Neighborhoods

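Abstract

Visual place recognition (VPR) with deep networks has achieved state-of-the-art performance.
However, most such methods require a training set with ground-truth sensor poses in order to obtain positive and negative samples from each observation's spatial neighborhood for supervised learning.
When such information is unavailable, temporal neighborhoods from a sequentially collected data stream can be exploited for self-supervised training, although we find the resulting performance to be suboptimal.
Inspired by noisy label learning, we propose a novel self-supervised framework named TF-VPR that uses temporal neighborhoods and learnable feature neighborhoods to discover unknown spatial neighborhoods.
Our method follows an iterative training paradigm that alternates between (1) representation learning with data augmentation, (2) positive set expansion to include the current feature-space neighbors, and (3) positive set contraction via geometric verification.
We conduct comprehensive experiments on both simulated and real datasets, with either RGB images or point clouds as inputs.
The results show that our method outperforms the baselines in recall rate, robustness, and heading diversity, a novel metric we propose for VPR.
Our code and datasets can be found at https://ai4ce.github.io/TF-VPR/.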

Abstract (original)

Visual place recognition (VPR) using deep networks has achieved state-of-the-art performance. However, most of them require a training set with ground truth sensor poses to obtain positive and negative samples of each observation’s spatial neighborhood for supervised learning. When such information is unavailable, temporal neighborhoods from a sequentially collected data stream could be exploited for self-supervised training, although we find its performance suboptimal. Inspired by noisy label learning, we propose a novel self-supervised framework named \textit{TF-VPR} that uses temporal neighborhoods and learnable feature neighborhoods to discover unknown spatial neighborhoods. Our method follows an iterative training paradigm which alternates between: (1) representation learning with data augmentation, (2) positive set expansion to include the current feature space neighbors, and (3) positive set contraction via geometric verification. We conduct comprehensive experiments on both simulated and real datasets, with either RGB images or point clouds as inputs. The results show that our method outperforms our baselines in recall rate, robustness, and heading diversity, a novel metric we propose for VPR. Our code and datasets can be found at https://ai4ce.github.io/TF-VPR/.
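
The expansion and contraction steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `window` and `k` parameters, the 2-D toy features, and the distance-threshold `verify` callback (a stand-in for real geometric verification) are all assumptions made for illustration, and the representation-learning step is omitted entirely.

```python
import math

def temporal_neighbors(i, n, window):
    """Frames within `window` time steps of frame i (excluding i itself)."""
    return {j for j in range(max(0, i - window), min(n, i + window + 1)) if j != i}

def feature_neighbors(i, feats, k):
    """The k frames closest to frame i in the current feature space."""
    dists = sorted((math.dist(feats[i], feats[j]), j)
                   for j in range(len(feats)) if j != i)
    return {j for _, j in dists[:k]}

def update_positives(feats, window, k, verify):
    """One expansion/contraction round over all frames.

    `verify(i, j) -> bool` is a hypothetical placeholder for geometric
    verification; any pairwise check can be plugged in.
    """
    n = len(feats)
    positives = []
    for i in range(n):
        temporal = temporal_neighbors(i, n, window)
        # Expansion: propose feature-space neighbors not already temporal.
        candidates = feature_neighbors(i, feats, k) - temporal
        # Contraction: keep only the candidates that pass verification.
        accepted = {j for j in candidates if verify(i, j)}
        positives.append(temporal | accepted)
    return positives

# Toy features: frames 0 and 5 are far apart in time but close in feature space.
feats = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0), (0.0, 0.1)]
# Assumed stand-in check: accept pairs whose features are within 0.5.
verify = lambda i, j: math.dist(feats[i], feats[j]) < 0.5
positives = update_positives(feats, window=1, k=2, verify=verify)
```

In this toy run, frame 0 gains frame 5 as a positive even though the two are not temporal neighbors, while frame 5's second feature neighbor (frame 1) is rejected by the verification stand-in. In a full training loop, the features would be recomputed after each round of representation learning before this update is run again.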

arXiv information

Authors	Chao Chen,Xinhao Liu,Xuchu Xu,Yiming Li,Li Ding,Ruoyu Wang,Chen Feng
Published	2022-08-19 12:59:46+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
