Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments

Summary

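Navigation poses a significant challenge for persons with visual impairments (PVI).
Traditional aids such as white canes and guide dogs are invaluable, but they fall short in providing detailed spatial information and precise guidance to a desired location.
Recent developments in large language models (LLMs) and vision-language models (VLMs) offer new avenues for enhancing assistive navigation.
This paper introduces Guide-LLM, an embodied LLM-based agent designed to help PVI navigate large indoor environments.
The approach features a novel text-based topological map that lets the LLM plan global paths using a simplified representation of the environment.
It also uses the LLM's commonsense reasoning for hazard detection and for personalized path planning based on user preferences.
Simulated experiments demonstrate the system's effectiveness in guiding PVI and underscore its potential as a significant advance in assistive technology.
The results highlight Guide-LLM's ability to provide efficient, adaptive, and personalized navigation assistance, pointing to promising progress in this field.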

Summary (original)

Navigation presents a significant challenge for persons with visual impairments (PVI). While traditional aids such as white canes and guide dogs are invaluable, they fall short in delivering detailed spatial information and precise guidance to desired locations. Recent developments in large language models (LLMs) and vision-language models (VLMs) offer new avenues for enhancing assistive navigation. In this paper, we introduce Guide-LLM, an embodied LLM-based agent designed to assist PVI in navigating large indoor environments. Our approach features a novel text-based topological map that enables the LLM to plan global paths using a simplified environmental representation, focusing on straight paths and right-angle turns to facilitate navigation. Additionally, we utilize the LLM’s commonsense reasoning for hazard detection and personalized path planning based on user preferences. Simulated experiments demonstrate the system’s efficacy in guiding PVI, underscoring its potential as a significant advancement in assistive technology. The results highlight Guide-LLM’s ability to offer efficient, adaptive, and personalized navigation assistance, pointing to promising advancements in this field.

arXiv information

Authors	Sangmim Song, Sarath Kodagoda, Amal Gunatilake, Marc G. Carmichael, Karthick Thiyagarajan, Jodi Martin
Published	2025-03-11 23:45:58+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
