HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views

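Summary

We present HOTFormerLoc, a novel and versatile hierarchical octree-based transformer for large-scale 3D place recognition in both ground-to-ground and ground-to-aerial scenarios across urban and forest environments.
We propose an octree-based multi-scale attention mechanism that captures spatial and semantic features across granularities.
To address the variable density of point distributions from spinning lidar, we present cylindrical octree attention windows that reflect the underlying distribution during attention.
We introduce relay tokens to enable efficient global-local interactions and multi-scale representation learning at reduced computational cost.
Pyramid attentional pooling then synthesises a robust global descriptor for end-to-end place recognition in challenging environments.
In addition, we introduce CS-Wild-Places, a new 3D cross-source dataset featuring point cloud data from aerial and ground lidar scans captured in dense forests.
Point clouds in CS-Wild-Places contain representational gaps and distinctive attributes such as varying point densities and noise patterns, making it a challenging benchmark for cross-view localisation in the wild.
HOTFormerLoc achieves a top-1 average recall improvement of 5.5% to 11.5% on the CS-Wild-Places benchmark.
Furthermore, it consistently outperforms SOTA 3D place recognition methods, with an average performance gain of 4.9% on well-established urban and forest datasets.
The code and the CS-Wild-Places benchmark are available at https://csiro-robotics.github.io/HOTFormerLoc.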
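The motivation for cylindrical attention windows can be illustrated with a small sketch. A spinning lidar samples at roughly fixed angular steps, so point density falls off with range; expressing points as (rho, theta, z) lets bins follow that radial pattern. The function names, bin width, and bin count below are hypothetical choices for illustration, and the code does not reproduce the paper's octree or window construction.

```python
import math


def to_cylindrical(points):
    """Convert Cartesian (x, y, z) points to cylindrical (rho, theta, z)."""
    converted = []
    for x, y, z in points:
        rho = math.hypot(x, y)    # horizontal range from the sensor axis
        theta = math.atan2(y, x)  # azimuth in radians, in (-pi, pi]
        converted.append((rho, theta, z))
    return converted


def radial_bin_counts(points, bin_width, num_bins):
    """Count points per radial bin (hypothetical helper, not from the paper).

    Points beyond the last bin edge are clamped into the final bin.
    """
    counts = [0] * num_bins
    for rho, _, _ in to_cylindrical(points):
        index = min(int(rho // bin_width), num_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    scan = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5), (30.0, 0.0, 1.0)]
    print(to_cylindrical(scan))
    print(radial_bin_counts(scan, bin_width=10.0, num_bins=3))
```

On a real scan, the per-bin counts typically drop sharply with range, which is the density variation the cylindrical windows are intended to reflect.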

Abstract (original)

We present HOTFormerLoc, a novel and versatile Hierarchical Octree-based TransFormer, for large-scale 3D place recognition in both ground-to-ground and ground-to-aerial scenarios across urban and forest environments. We propose an octree-based multi-scale attention mechanism that captures spatial and semantic features across granularities. To address the variable density of point distributions from spinning lidar, we present cylindrical octree attention windows to reflect the underlying distribution during attention. We introduce relay tokens to enable efficient global-local interactions and multi-scale representation learning at reduced computational cost. Our pyramid attentional pooling then synthesises a robust global descriptor for end-to-end place recognition in challenging environments. In addition, we introduce CS-Wild-Places, a novel 3D cross-source dataset featuring point cloud data from aerial and ground lidar scans captured in dense forests. Point clouds in CS-Wild-Places contain representational gaps and distinctive attributes such as varying point densities and noise patterns, making it a challenging benchmark for cross-view localisation in the wild. HOTFormerLoc achieves a top-1 average recall improvement of 5.5% – 11.5% on the CS-Wild-Places benchmark. Furthermore, it consistently outperforms SOTA 3D place recognition methods, with an average performance gain of 4.9% on well-established urban and forest datasets. The code and CS-Wild-Places benchmark is available at https://csiro-robotics.github.io/HOTFormerLoc.
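The top-1 recall figures quoted above come from a retrieval-style evaluation. As a minimal sketch of how such a metric is commonly computed: each query's global descriptor is matched to its nearest database descriptor, and the match counts as correct if the two scans were captured within a distance threshold of each other. The function name, the toy descriptors, and the 25 m threshold are assumptions for illustration, not the paper's evaluation protocol.

```python
import math


def recall_at_1(query_desc, query_pos, db_desc, db_pos, threshold_m):
    """Fraction of queries whose nearest database descriptor is a true match.

    A retrieval is counted as correct when the retrieved scan's position lies
    within threshold_m metres of the query position (an assumed criterion).
    """
    hits = 0
    for q_desc, q_pos in zip(query_desc, query_pos):
        # Index of the database entry closest to the query in descriptor space.
        best = min(range(len(db_desc)),
                   key=lambda i: math.dist(q_desc, db_desc[i]))
        if math.dist(q_pos, db_pos[best]) <= threshold_m:
            hits += 1
    return hits / len(query_desc)


if __name__ == "__main__":
    # Toy 2D descriptors and 2D positions, purely illustrative.
    query_desc = [(0.0, 1.0), (1.0, 0.0)]
    query_pos = [(0.0, 0.0), (100.0, 0.0)]
    db_desc = [(0.1, 0.9), (0.9, 0.2)]
    db_pos = [(2.0, 1.0), (50.0, 0.0)]
    print(recall_at_1(query_desc, query_pos, db_desc, db_pos, threshold_m=25.0))
```

In this toy case the first query retrieves a scan about 2.2 m away (a hit) and the second retrieves a scan 50 m away (a miss), so the function returns 0.5. An average recall would then be the mean of this value over several evaluation splits.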

arXiv information

Authors	Ethan Griffiths, Maryam Haghighat, Simon Denman, Clinton Fookes, Milad Ramezani
Published	2025-03-21 07:00:11+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
