SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation

Abstract

Efficient path planning in robotics, particularly within large-scale, dynamic environments, remains a significant hurdle. While Large Language Models (LLMs) offer strong reasoning capabilities, their high computational cost and limited adaptability in dynamic scenarios hinder real-time deployment on edge devices. We present SmallPlan, a novel framework that leverages LLMs as teacher models to train lightweight Small Language Models (SLMs) for high-level path planning tasks. In SmallPlan, the SLMs provide optimal action sequences to navigate across scene graphs that compactly represent full-scale 3D scenes. The SLMs are trained in a simulation-powered, interleaved manner with LLM-guided supervised fine-tuning (SFT) and reinforcement learning (RL). This strategy not only enables SLMs to successfully complete navigation tasks but also makes them aware of important factors such as travel distance and number of trials. Through experiments, we demonstrate that the fine-tuned SLMs perform competitively with larger models like GPT-4o on sequential path planning, without suffering from hallucination or overfitting. SmallPlan is resource-efficient, making it well-suited for edge-device deployment and advancing practical autonomous robotics.

arXiv information

Authors	Quang P. M. Pham, Khoi T. N. Nguyen, Nhi H. Doan, Cuong A. Pham, Kentaro Inui, Dezhen Song
Published	2025-05-01 19:44:36+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
