Learning to Plan with Natural Language

Summary

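Large language models (LLMs) have shown remarkable performance on a variety of basic natural language tasks.
To complete a complex task, however, an LLM still needs a task plan that guides it to generate a specific solution step by step.
LLMs can generate task plans directly, but these plans may still contain factual errors or be incomplete.
A high-quality task plan contains correct step-by-step solutions for every situation, along with behavioral instructions for avoiding mistakes.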
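To obtain such a plan, we propose the Learning to Plan method, which involves two phases. (1) In the first phase, learning the task plan, the method iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive them from feedback on training errors.
(2) In the subsequent test phase, the LLM uses the learned task plan to guide its inference on the test set.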
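We demonstrate the effectiveness of the method on tasks covering five different reasoning types (8 datasets).
Further, our analysis experiment shows that a task plan learned by one LLM can directly guide another LLM to improve its performance, which reveals a new transfer learning paradigm.
We release the code at https://github.com/Eureka6174/LearnNLPlan.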
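
The two phases described in the summary can be pictured as a small loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the prompt wording, the function names (`solve`, `learn_task_plan`), exact-match scoring, the stopping rule, and the `toy_llm` stand-in are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, List, Tuple

LLM = Callable[[str], str]   # hypothetical interface: prompt in, text out
Example = Tuple[str, str]    # (question, gold answer)


def solve(llm: LLM, plan: str, question: str) -> str:
    """Test-phase step: answer one question with the task plan in the prompt."""
    return llm(f"Task plan:\n{plan}\n\nQuestion: {question}\nAnswer:").strip()


def learn_task_plan(llm: LLM, train_set: List[Example],
                    plan: str = "", max_rounds: int = 3) -> str:
    """Learning phase: revise the plan from training errors until none remain."""
    for _ in range(max_rounds):
        errors = []
        for question, gold in train_set:
            prediction = solve(llm, plan, question)
            if prediction != gold.strip():  # assumption: exact-match scoring
                errors.append((question, gold, prediction))
        if not errors:
            break  # assumption: stop once the training set is solved
        feedback = "\n".join(
            f"Question: {q} | expected: {g} | got: {p}" for q, g, p in errors
        )
        # Ask the model to derive a revised plan from the error feedback.
        plan = llm(
            "Revise the task plan.\n"
            f"Current task plan:\n{plan}\n\nFailed cases:\n{feedback}"
        ).strip()
    return plan


def toy_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, so the sketch runs offline."""
    if prompt.startswith("Revise the task plan"):
        return "Write every answer in uppercase."
    plan, rest = prompt.split("\n\nQuestion: ", 1)
    question = rest.split("\n", 1)[0]
    return question.upper() if "uppercase" in plan else question
```

With a real model, `toy_llm` would be replaced by a function that sends the prompt to that model. Because the learned plan is plain text, the same string could be passed to `solve` with a different model, which is the transfer setting the summary mentions.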

Summary (original)

Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks. For completing the complex task, we still need a plan for the task to guide LLMs to generate the specific solutions step by step. LLMs can directly generate task plans, but these plans may still contain factual errors or are incomplete. A high-quality task plan contains correct step-by-step solutions for solving all situations and behavioral instructions for avoiding mistakes. To obtain it, we propose the Learning to Plan method, which involves two phases: (1) In the first learning task plan phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive from training error feedback. (2) In the subsequent test phase, the LLM uses the learned task plan to guide the inference of LLM on the test set. We demonstrate the effectiveness of our method on the five different reasoning type tasks (8 datasets). Further, our analysis experiment shows that the task plan learned by one LLM can directly guide another LLM to improve its performance, which reveals a new transfer learning paradigm. We release the code at \url{https://github.com/Eureka6174/LearnNLPlan}

arXiv information

Authors	Yiduo Guo, Yaobo Liang, Chenfei Wu, Wenshan Wu, Dongyan Zhao, Nan Duan
Published	2023-12-13 02:08:50+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
