SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation

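Summary

Efficient path planning in robotics, especially in large-scale, dynamic environments, remains a significant hurdle.
Large Language Models (LLMs) offer strong reasoning capabilities, but their high computational cost and limited adaptability in dynamic scenarios hinder real-time deployment on edge devices.
We present SmallPlan, a novel framework that uses LLMs as teacher models to train lightweight Small Language Models (SLMs) for high-level path planning tasks.
In SmallPlan, the SLMs provide optimal action sequences for navigating across scene graphs that compactly represent full-scale 3D scenes.
The SLMs are trained in a simulation-powered, interleaved manner with LLM-guided supervised fine-tuning (SFT) and reinforcement learning (RL).
This strategy not only enables the SLMs to complete navigation tasks successfully but also makes them aware of important factors such as travel distance and number of trials.
Through experiments, we demonstrate that the fine-tuned SLMs perform competitively with larger models such as GPT-4o on sequential path planning, without suffering from hallucination or overfitting.
SmallPlan is resource-efficient, making it well suited to edge-device deployment and to advancing practical autonomous robotics.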

Abstract (original)

Efficient path planning in robotics, particularly within large-scale, dynamic environments, remains a significant hurdle. While Large Language Models (LLMs) offer strong reasoning capabilities, their high computational cost and limited adaptability in dynamic scenarios hinder real-time deployment on edge devices. We present SmallPlan — a novel framework leveraging LLMs as teacher models to train lightweight Small Language Models (SLMs) for high-level path planning tasks. In SmallPlan, the SLMs provide optimal action sequences to navigate across scene graphs that compactly represent full-scaled 3D scenes. The SLMs are trained in a simulation-powered, interleaved manner with LLM-guided supervised fine-tuning (SFT) and reinforcement learning (RL). This strategy not only enables SLMs to successfully complete navigation tasks but also makes them aware of important factors like travel distance and number of trials. Through experiments, we demonstrate that the fine-tuned SLMs perform competitively with larger models like GPT-4o on sequential path planning, without suffering from hallucination and overfitting. SmallPlan is resource-efficient, making it well-suited for edge-device deployment and advancing practical autonomous robotics.

arXiv information

Authors	Quang P. M. Pham, Khoi T. N. Nguyen, Nhi H. Doan, Cuong A. Pham, Kentaro Inui, Dezhen Song
Published	2025-05-08 13:12:24+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
