TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents

Summary

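Recent advances in natural language processing have made large language models (LLMs) powerful tools for a wide range of real-world applications.
Despite their strengths, the intrinsic generative abilities of LLMs can prove insufficient for complex tasks that require a combination of task planning and the use of external tools.
This paper first proposes a structured framework tailored to LLM-based AI agents and discusses the key capabilities needed to tackle complex problems.
Within this framework, the authors design two distinct types of agents (namely, a one-step agent and a sequential agent) to carry out the inference process.
They then instantiate the framework with various LLMs and evaluate their Task Planning and Tool Usage (TPTU) abilities on typical tasks.
By highlighting key findings and challenges, the authors aim to provide a useful resource for researchers and practitioners who want to leverage LLMs in their AI applications.
The study emphasizes the substantial potential of these models while also identifying areas that need further investigation and improvement.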

Summary (original)

With recent advancements in natural language processing, Large Language Models (LLMs) have emerged as powerful tools for various real-world applications. Despite their prowess, the intrinsic generative abilities of LLMs may prove insufficient for handling complex tasks which necessitate a combination of task planning and the usage of external tools. In this paper, we first propose a structured framework tailored for LLM-based AI Agents and discuss the crucial capabilities necessary for tackling intricate problems. Within this framework, we design two distinct types of agents (i.e., one-step agent and sequential agent) to execute the inference process. Subsequently, we instantiate the framework using various LLMs and evaluate their Task Planning and Tool Usage (TPTU) abilities on typical tasks. By highlighting key findings and challenges, our goal is to provide a helpful resource for researchers and practitioners to leverage the power of LLMs in their AI applications. Our study emphasizes the substantial potential of these models, while also identifying areas that need more investigation and improvement.
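The abstract names two agent types but does not describe how they differ. The following is a minimal sketch, assuming that a one-step agent asks for a complete plan up front and then executes it, while a sequential agent asks for one step at a time and feeds earlier results back in. The tool registry, `stub_planner`, and all other names are hypothetical stand-ins for illustration; they are not taken from the paper, and a real agent would call an LLM where the stub returns a hard-coded plan.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: names and behaviours are illustrative only.
Tool = Callable[[str], str]
TOOLS: Dict[str, Tool] = {
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
    "echo": lambda text: text,
}

Step = Tuple[str, str]  # (tool name, tool input)


def stub_planner(task: str, history: List[str]) -> List[Step]:
    """Stand-in for an LLM call: returns the steps still to be done.

    Assumption: a real agent would prompt an LLM with the task and the
    results so far; this stub ignores the task and uses a fixed plan.
    """
    plan: List[Step] = [("calculator", "2+3"), ("echo", "sum computed")]
    return plan[len(history):]


def one_step_agent(task: str) -> List[str]:
    """Query the planner once for the full plan, then run every step."""
    plan = stub_planner(task, [])
    return [TOOLS[name](arg) for name, arg in plan]


def sequential_agent(task: str, max_steps: int = 10) -> List[str]:
    """Query the planner before each step, passing back prior results."""
    history: List[str] = []
    for _ in range(max_steps):
        remaining = stub_planner(task, history)
        if not remaining:
            break  # planner reports nothing left to do
        name, arg = remaining[0]
        history.append(TOOLS[name](arg))
    return history


if __name__ == "__main__":
    print(one_step_agent("add two numbers and report"))
    print(sequential_agent("add two numbers and report"))
```

With this fixed stub both agents produce the same results. The difference lies in the control flow: the sequential variant gives the planner a chance to revise the next step after each tool result, at the cost of one planner call per step.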

arXiv information

Authors	Jingqing Ruan, Yihong Chen, Bin Zhang, Zhiwei Xu, Tianpeng Bao, Guoqing Du, Shiwei Shi, Hangyu Mao, Xingyu Zeng, Rui Zhao
Published	2023-08-07 09:22:03+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
