LaF-GRPO: In-Situ Navigation Instruction Generation for the Visually Impaired via GRPO with LLM-as-Follower Reward

Summary

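Navigation instruction generation for visually impaired (VI) individuals (NIG-VI) is critical yet relatively underexplored.
This study therefore focuses on producing precise, in-situ, step-by-step navigation instructions that VI users can actually use.
Specifically, we propose LaF-GRPO (LLM-as-Follower GRPO), in which an LLM simulates VI user responses to generate rewards that guide the post-training of a Vision-Language Model (VLM).
This improves instruction usability while reducing the need for costly real-world data.
To facilitate training and testing, we introduce NIG4VI, an open-source benchmark of 27k samples.
It provides diverse navigation scenarios with accurate spatial coordinates and supports detailed, open-ended in-situ instruction generation.
Experiments on NIG4VI show the effectiveness of LaF-GRPO on quantitative metrics (e.g., Zero-(LaF-GRPO) improves BLEU by 14%; SFT+(LaF-GRPO) reaches METEOR 0.542 versus GPT-4o's 0.323), and the method yields more intuitive, safer instructions.
Code and the benchmark are available at https://github.com/YiyiyiZhao/NIG4VI.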

Abstract (original)

Navigation instruction generation for visually impaired (VI) individuals (NIG-VI) is critical yet relatively underexplored. This study, hence, focuses on producing precise, in-situ, step-by-step navigation instructions that are practically usable by VI users. Concretely, we propose LaF-GRPO (LLM-as-Follower GRPO), where an LLM simulates VI user responses to generate rewards guiding the Vision-Language Model (VLM) post-training. This enhances instruction usability while reducing costly real-world data needs. To facilitate training and testing, we introduce NIG4VI, a 27k-sample open-sourced benchmark. It provides diverse navigation scenarios with accurate spatial coordinates, supporting detailed, open-ended in-situ instruction generation. Experiments on NIG4VI show the effectiveness of LaF-GRPO by quantitative metrics (e.g., Zero-(LaF-GRPO) boosts BLEU +14%; SFT+(LaF-GRPO) METEOR 0.542 vs. GPT-4o’s 0.323) and yields more intuitive, safer instructions. Code and benchmark are available at https://github.com/YiyiyiZhao/NIG4VI.
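
The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes that a follower model turns each candidate instruction into an interpreted action, that the reward is how well that action matches a reference, and that rewards are normalized within the sampled group as in standard GRPO. The names `toy_follower`, `follower_reward`, and `group_relative_advantages` are hypothetical, the keyword-based follower merely stands in for an LLM call, and the scoring rule is an illustrative assumption rather than the paper's reward definition.

```python
import re
import statistics
from typing import Callable, Dict, List, Optional, Union

Action = Dict[str, Optional[Union[str, int]]]


def toy_follower(instruction: str) -> Action:
    """Hypothetical stand-in for the LLM follower.

    Extracts the direction and step count a user might act on.
    A real system would query an LLM here instead of matching keywords.
    """
    text = instruction.lower()
    direction = next((d for d in ("left", "right", "forward") if d in text), None)
    steps = re.search(r"(\d+)\s*steps?", text)
    return {"direction": direction, "steps": int(steps.group(1)) if steps else None}


def follower_reward(
    instruction: str,
    reference: Action,
    follower: Callable[[str], Action] = toy_follower,
) -> float:
    """Fraction of reference fields the simulated follower recovers (assumed rule)."""
    interpreted = follower(instruction)
    hits = sum(interpreted.get(key) == value for key, value in reference.items())
    return hits / len(reference)


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style advantages: standardize each reward within its sampled group.

    Uses the population standard deviation; some implementations use the
    sample standard deviation instead.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# One group of sampled candidate instructions for the same scene.
candidates = [
    "Turn left and walk 3 steps.",
    "Go that way for a bit.",
    "Turn left, then continue.",
]
reference = {"direction": "left", "steps": 3}

rewards = [follower_reward(c, reference) for c in candidates]
advantages = group_relative_advantages(rewards)
for text, reward, adv in zip(candidates, rewards, advantages):
    print(f"{reward:.2f} {adv:+.2f} {text}")
```

In this toy group, the instruction that states both the direction and the step count receives the highest advantage and the vague one the lowest, which is the kind of signal a policy-gradient update would then reinforce.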

arXiv information

Authors	Yi Zhao, Siqi Wang, Jing Li
Published	2025-06-04 15:34:33+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
