GeoDTR+: Toward generic cross-view geolocalization via geometric disentanglement

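Summary

Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it against geo-tagged aerial images in a database.
Recent work has made outstanding progress on CVGL benchmarks.
However, existing methods still perform poorly in cross-area evaluation, where the training and test data are captured from completely distinct areas.
The authors attribute this deficiency to a lack of ability to extract the geometric layout of visual features and to models overfitting to low-level details.
Their preliminary work introduced a Geometric Layout Extractor (GLE) to capture the geometric layout from input features.
However, the previous GLE does not fully exploit the information in the input features.
This work proposes GeoDTR+, which has an enhanced GLE module that better models the correlations among visual features.
To fully explore the LS techniques from the preliminary work, the authors further propose Contrastive Hard Samples Generation (CHSG) to facilitate model training.
Extensive experiments show that GeoDTR+ achieves state-of-the-art (SOTA) results in cross-area evaluation on CVUSA, CVACT, and VIGOR by a large margin ($16.44\%$, $22.71\%$, and $17.02\%$ without polar transformation) while keeping same-area performance comparable to existing SOTA.
The paper also provides detailed analyses of GeoDTR+.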

Abstract (original)

Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it to a geo-tagged aerial image in a database. Recent works achieve outstanding progress on CVGL benchmarks. However, existing methods still suffer from poor performance in cross-area evaluation, in which the training and testing data are captured from completely distinct areas. We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models’ overfitting to low-level details. Our preliminary work introduced a Geometric Layout Extractor (GLE) to capture the geometric layout from input features. However, the previous GLE does not fully exploit information in the input feature. In this work, we propose GeoDTR+ with an enhanced GLE module that better models the correlations among visual features. To fully explore the LS techniques from our preliminary work, we further propose Contrastive Hard Samples Generation (CHSG) to facilitate model training. Extensive experiments show that GeoDTR+ achieves state-of-the-art (SOTA) results in cross-area evaluation on CVUSA, CVACT, and VIGOR by a large margin ($16.44\%$, $22.71\%$, and $17.02\%$ without polar transformation) while keeping the same-area performance comparable to existing SOTA. Moreover, we provide detailed analyses of GeoDTR+.
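
The retrieval setting described in the abstract (matching a ground image against geo-tagged aerial images) can be illustrated with a minimal sketch. This is not the GeoDTR+ model: the embeddings below are hand-made toy vectors, and the names `cosine`, `localize`, and `recall_at_1` are hypothetical helpers introduced only for illustration. A real system would obtain the embeddings from trained ground and aerial encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def localize(ground_vec, aerial_db):
    """Return the (lat, lon) tag of the aerial embedding most similar to the query."""
    best = max(aerial_db, key=lambda entry: cosine(ground_vec, entry["embedding"]))
    return best["latlon"]

def recall_at_1(queries, aerial_db):
    """Fraction of ground queries whose top-1 match carries the true location."""
    hits = sum(localize(q["embedding"], aerial_db) == q["latlon"] for q in queries)
    return hits / len(queries)

# Toy embeddings (assumption: stand-ins for the output of a trained encoder).
aerial_db = [
    {"latlon": (40.0, -75.0), "embedding": [1.0, 0.0, 0.0]},
    {"latlon": (41.0, -74.0), "embedding": [0.0, 1.0, 0.0]},
    {"latlon": (42.0, -73.0), "embedding": [0.0, 0.0, 1.0]},
]
queries = [
    {"latlon": (40.0, -75.0), "embedding": [0.9, 0.1, 0.0]},
    {"latlon": (41.0, -74.0), "embedding": [0.2, 0.8, 0.1]},
    {"latlon": (42.0, -73.0), "embedding": [0.6, 0.1, 0.5]},  # deliberately ambiguous query
]

print(recall_at_1(queries, aerial_db))
```

In this toy data the third query is closer to the wrong aerial embedding, so two of the three queries are localized correctly. Under this framing, cross-area evaluation would mean that `queries` and `aerial_db` come from a region the encoders never saw during training; the retrieval step itself stays the same.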

arXiv information

Authors	Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, Safwan Wshah
Published	2023-08-18 15:32:01+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
