Mobile Robot Navigation Using Hand-Drawn Maps: A Vision Language Model Approach

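Summary

Hand-drawn maps offer a natural and efficient way to convey navigation instructions between humans and robots.
However, these maps often contain inaccuracies, such as scale distortions and missing landmarks, that make mobile robot navigation difficult.
This paper introduces a novel Hand-drawn Map Navigation (HAM-Nav) architecture that leverages pre-trained vision language models (VLMs) for robot navigation across diverse environments, hand-drawing styles, and robot embodiments, even when the map is inaccurate.
HAM-Nav integrates a Selective Visual Association Prompting approach, used for topological map-based position estimation and navigation planning, with a Predictive Navigation Plan Parser that infers missing landmarks.
Extensive experiments in photorealistic simulated environments, using both wheeled and legged robots, demonstrated the effectiveness of HAM-Nav in terms of navigation success rate and Success weighted by Path Length.
In addition, a user study in real-world environments highlighted the practical utility of hand-drawn maps for robot navigation, as well as successful navigation outcomes compared against a non-hand-drawn map approach.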

Abstract (original)

Hand-drawn maps can be used to convey navigation instructions between humans and robots in a natural and efficient manner. However, these maps can often contain inaccuracies such as scale distortions and missing landmarks which present challenges for mobile robot navigation. This paper introduces a novel Hand-drawn Map Navigation (HAM-Nav) architecture that leverages pre-trained vision language models (VLMs) for robot navigation across diverse environments, hand-drawing styles, and robot embodiments, even in the presence of map inaccuracies. HAM-Nav integrates a unique Selective Visual Association Prompting approach for topological map-based position estimation and navigation planning as well as a Predictive Navigation Plan Parser to infer missing landmarks. Extensive experiments were conducted in photorealistic simulated environments, using both wheeled and legged robots, demonstrating the effectiveness of HAM-Nav in terms of navigation success rates and Success weighted by Path Length. Furthermore, a user study in real-world environments highlighted the practical utility of hand-drawn maps for robot navigation as well as successful navigation outcomes compared against a non-hand-drawn map approach.
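The abstract reports results in terms of Success weighted by Path Length (SPL) without spelling out the formula. The following is a minimal sketch of the commonly used definition, SPL = (1/N) * sum_i S_i * l_i / max(p_i, l_i), where S_i marks whether episode i succeeded, l_i is the shortest path length, and p_i is the length of the path the robot actually took. The function name, argument names, and sample values are illustrative assumptions and are not taken from the paper; the authors' evaluation code may differ.

```python
from typing import Sequence


def success_weighted_by_path_length(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    actual_lengths: Sequence[float],
) -> float:
    """Return SPL over a set of navigation episodes (hypothetical helper)."""
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if len(shortest_lengths) != n or len(actual_lengths) != n:
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, actual in zip(successes, shortest_lengths, actual_lengths):
        if shortest <= 0:
            raise ValueError("shortest path lengths must be positive")
        if success:
            # A successful episode scores 1.0 when the robot follows the
            # shortest path and less as its path grows longer.
            total += shortest / max(actual, shortest)
        # A failed episode contributes 0 regardless of path length.
    return total / n


if __name__ == "__main__":
    # Three made-up episodes: optimal success, success with a detour, failure.
    print(success_weighted_by_path_length(
        [True, True, False],
        [10.0, 10.0, 10.0],
        [10.0, 20.0, 5.0],
    ))
```

Because failed episodes add nothing and detours are penalized in proportion to their extra length, SPL is never higher than the plain success rate computed over the same episodes.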

arXiv information

Authors	Aaron Hao Tan, Angus Fung, Haitong Wang, Goldie Nejat
Published	2025-04-28 18:14:08+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
