Language Models can Infer Action Semantics for Classical Planners from Environment Feedback

Summary

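Classical planning approaches are guaranteed to find a sequence of actions that achieves a given goal state whenever one exists, but they require an expert to specify the logical action semantics that govern the environment's dynamics.
Researchers have shown that large language models (LLMs) can directly infer planning steps from commonsense knowledge and minimal domain information alone, but such plans often fail at execution time.
We combine the strengths of classical planning and LLM commonsense inference to perform domain induction, learning and validating action preconditions and postconditions through closed-loop interaction with the environment itself.
We propose PSALM, which uses LLM inference both to heuristically complete partial plans produced by a classical planner given partial domain knowledge and to infer the domain's semantic rules in a logical language from environment feedback after execution.
Our analysis on 7 environments shows that, with just one expert-curated example plan, using LLMs as heuristic planners and rule predictors requires fewer environment execution steps and environment resets than random exploration while simultaneously recovering the domain's underlying ground-truth action semantics.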

Summary (original)

Classical planning approaches guarantee finding a set of actions that can achieve a given goal state when possible, but require an expert to specify logical action semantics that govern the dynamics of the environment. Researchers have shown that Large Language Models (LLMs) can be used to directly infer planning steps based on commonsense knowledge and minimal domain information alone, but such plans often fail on execution. We bring together the strengths of classical planning and LLM commonsense inference to perform domain induction, learning and validating action pre- and post-conditions based on closed-loop interactions with the environment itself. We propose PSALM, which leverages LLM inference to heuristically complete partial plans emitted by a classical planner given partial domain knowledge, as well as to infer the semantic rules of the domain in a logical language based on environment feedback after execution. Our analysis on 7 environments shows that with just one expert-curated example plan, using LLMs as heuristic planners and rule predictors achieves lower environment execution steps and environment resets than random exploration while simultaneously recovering the underlying ground truth action semantics of the domain.
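
The closed loop described in the abstract (plan with partial domain knowledge, execute, read the environment feedback, revise the action rules) can be illustrated with a toy sketch. This is not the authors' implementation: the three-action door domain, every function name, and `stub_rule_predictor` (a fixed lookup standing in for the LLM rule predictor) are hypothetical. Only preconditions are learned here, with effects assumed known for brevity, and the LLM plan-completion step is omitted.

```python
# Hypothetical toy domain. The ground-truth preconditions are hidden inside
# the environment; the learner only sees which step of a plan failed.
TRUE_PRECONDITIONS = {
    "pick_key": set(),
    "open_door": {"has_key"},
    "walk_through": {"door_open"},
}
# Assumption: effects are known up front (the paper also learns post-conditions).
EFFECTS = {
    "pick_key": {"has_key"},
    "open_door": {"door_open"},
    "walk_through": {"at_goal"},
}


def execute(plan):
    """Run a plan from a freshly reset state.

    Returns (index of the first failing step or None, state at that point).
    """
    state = set()
    for i, action in enumerate(plan):
        if not TRUE_PRECONDITIONS[action] <= state:
            return i, frozenset(state)
        state |= EFFECTS[action]
    return None, frozenset(state)


def plan_for(goal, learned):
    """Backward-chain from the goal using the currently learned preconditions."""
    producer = next(a for a, eff in EFFECTS.items() if goal in eff)
    steps = []
    for pre in sorted(learned[producer]):
        steps += plan_for(pre, learned)
    return steps + [producer]


def stub_rule_predictor(action, state):
    """Stand-in for the LLM: guess one missing precondition of a failed action."""
    commonsense = {"open_door": "has_key", "walk_through": "door_open"}
    return commonsense[action]


def induce_domain(goal, predict_rule, max_resets=10):
    """Alternate planning, execution, and rule revision until a plan succeeds."""
    learned = {action: set() for action in EFFECTS}
    for resets in range(1, max_resets + 1):
        plan = plan_for(goal, learned)
        failed_at, state = execute(plan)
        if failed_at is None:
            # Every step ran; the plan ends with the action that produces the goal.
            return learned, plan, resets
        failed_action = plan[failed_at]
        learned[failed_action].add(predict_rule(failed_action, state))
    raise RuntimeError("no valid plan found within the reset budget")


learned, plan, resets = induce_domain("at_goal", stub_rule_predictor)
print(plan, resets)
```

In this toy run each failed execution adds one precondition, so the learned rules match the hidden ones after three environment resets. A real rule predictor can guess wrongly, which is why the abstract stresses validating rules against the environment rather than trusting them.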

arXiv information

Authors	Wang Zhu, Ishika Singh, Robin Jia, Jesse Thomason
Published	2024-06-04 21:29:56+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
