MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems

Summary

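Large language models (LLMs) have shown promise as zero-shot planners for embodied agents, but their inability to learn from experience and build persistent mental models limits their robustness in complex open-world environments such as Minecraft.
We introduce MINDSTORES, an experience-augmented planning framework that lets embodied agents build and use mental models through natural interaction with their environment.
Inspired by how humans construct and refine cognitive mental models, the approach extends existing zero-shot LLM planning by maintaining a database of past experiences that informs future planning iterations.
The key innovation is representing accumulated experiences as natural language embeddings of (state, task, plan, outcome) tuples, which an LLM planner can efficiently retrieve and reason over to generate insights and guide plan refinement for novel states and tasks.
In extensive experiments in MineDojo, a simulation environment that gives agents low-level control in Minecraft, MINDSTORES learns and applies its knowledge significantly better than existing memory-based LLM planners while retaining the flexibility and generalization benefits of zero-shot approaches.
This represents an important step toward more capable embodied AI systems that can learn continuously through natural experience.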
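
The experience database of (state, task, plan, outcome) tuples described in the summary can be illustrated with a minimal sketch. Everything named here (`Experience`, `ExperienceStore`, `embed`, `build_prompt`) is hypothetical and not taken from the paper, and the bag-of-words `embed` function is only a toy stand-in for whatever natural language embedding model the authors use. The sketch shows the general retrieve-then-plan pattern: store past experiences as text, rank them by similarity to the current state and task, and place the closest ones in the planner's prompt.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """One (state, task, plan, outcome) tuple, all in natural language."""
    state: str
    task: str
    plan: str
    outcome: str

    def as_text(self) -> str:
        return " ".join([self.state, self.task, self.plan, self.outcome])


def embed(text: str) -> Counter:
    # Assumption: toy bag-of-words vector standing in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ExperienceStore:
    """Hypothetical in-memory experience database."""

    def __init__(self) -> None:
        self._items: list[tuple[Experience, Counter]] = []

    def add(self, exp: Experience) -> None:
        self._items.append((exp, embed(exp.as_text())))

    def retrieve(self, state: str, task: str, k: int = 2) -> list[Experience]:
        # Rank stored experiences by similarity to the current state and task.
        query = embed(f"{state} {task}")
        ranked = sorted(self._items, key=lambda it: cosine(query, it[1]), reverse=True)
        return [exp for exp, _ in ranked[:k]]


def build_prompt(state: str, task: str, experiences: list[Experience]) -> str:
    # Assemble the text a planner LLM would receive; no LLM is called here.
    lines = [f"Current state: {state}", f"Task: {task}", "Relevant past experiences:"]
    for exp in experiences:
        lines.append(f"- state={exp.state}; task={exp.task}; plan={exp.plan}; outcome={exp.outcome}")
    lines.append("Propose a refined plan that avoids past failures.")
    return "\n".join(lines)


store = ExperienceStore()
store.add(Experience("forest biome, empty inventory", "mine stone",
                     "dig down with bare hands", "failed: stone needs a pickaxe"))
store.add(Experience("forest biome, wooden pickaxe in inventory", "mine stone",
                     "use wooden pickaxe on stone", "success: obtained cobblestone"))
store.add(Experience("plains biome, empty inventory", "harvest wheat",
                     "walk to farm and break wheat", "success: obtained wheat"))

state, task = "forest biome, empty inventory", "mine iron ore"
print(build_prompt(state, task, store.retrieve(state, task, k=2)))
```

In this toy run the two stone-mining experiences rank above the wheat-harvesting one, so the prompt for the unseen task carries both a relevant failure and a relevant success. A real system would presumably swap `embed` for a learned embedding model and pass the prompt to an LLM planner.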

Summary (original)

While large language models (LLMs) have shown promising capabilities as zero-shot planners for embodied agents, their inability to learn from experience and build persistent mental models limits their robustness in complex open-world environments like Minecraft. We introduce MINDSTORES, an experience-augmented planning framework that enables embodied agents to build and leverage mental models through natural interaction with their environment. Drawing inspiration from how humans construct and refine cognitive mental models, our approach extends existing zero-shot LLM planning by maintaining a database of past experiences that informs future planning iterations. The key innovation is representing accumulated experiences as natural language embeddings of (state, task, plan, outcome) tuples, which can then be efficiently retrieved and reasoned over by an LLM planner to generate insights and guide plan refinement for novel states and tasks. Through extensive experiments in the MineDojo environment, a simulation environment for agents in Minecraft that provides low-level controls for Minecraft, we find that MINDSTORES learns and applies its knowledge significantly better than existing memory-based LLM planners while maintaining the flexibility and generalization benefits of zero-shot approaches, representing an important step toward more capable embodied AI systems that can learn continuously through natural experience.

arXiv information

Authors	Anirudh Chari, Suraj Reddy, Aditya Tiwari, Richard Lian, Brian Zhou
Published	2025-01-31 17:15:33+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
