MaestroMotif: Skill Design from Artificial Intelligence Feedback

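Abstract

Describing skills in natural language could offer an accessible way to inject human knowledge about decision-making into AI systems.
We introduce MaestroMotif, an AI-assisted skill design method that produces high-performing, adaptable agents.
MaestroMotif leverages the capabilities of large language models (LLMs) to create and reuse skills effectively.
First, starting from natural language descriptions, it uses LLM feedback to automatically design a reward for each skill.
Next, it combines the LLM's code generation abilities with reinforcement learning to train the skills and compose them into complex behaviors specified in language.
We evaluate MaestroMotif on a suite of complex tasks in the NetHack Learning Environment (NLE) and show that it surpasses existing approaches in both performance and usability.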

Abstract (original)

Describing skills in natural language has the potential to provide an accessible way to inject human knowledge about decision-making into an AI system. We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and adaptable agents. MaestroMotif leverages the capabilities of Large Language Models (LLMs) to effectively create and reuse skills. It first uses an LLM’s feedback to automatically design rewards corresponding to each skill, starting from their natural language description. Then, it employs an LLM’s code generation abilities, together with reinforcement learning, for training the skills and combining them to implement complex behaviors specified in language. We evaluate MaestroMotif using a suite of complex tasks in the NetHack Learning Environment (NLE), demonstrating that it surpasses existing approaches in both performance and usability.
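
The abstract describes combining trained skills through LLM-generated code. The following is a minimal sketch of what such a composition could look like, not the authors' implementation. Every name in it (`Skill`, `select_skill`, `run_episode`, the skill names, and the observation keys `hp` and `stairs_visible`) is a hypothetical stand-in: the skills are fixed toy functions rather than policies trained with reinforcement learning, and the selection logic is hand-written in place of code an LLM might generate from a task description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical toy observation, e.g. {"hp": 10, "stairs_visible": 0}.
Observation = Dict[str, int]


@dataclass
class Skill:
    """A named low-level policy.

    Assumption: in a real system each skill would be trained with RL on a
    reward designed from LLM feedback. Here it is a fixed toy function.
    """
    name: str
    act: Callable[[Observation], str]


def make_skills() -> Dict[str, Skill]:
    # Skill names and actions are invented purely for illustration.
    return {
        "explore": Skill("explore", lambda obs: "move"),
        "descend": Skill("descend", lambda obs: "go_down"),
        "recover": Skill("recover", lambda obs: "rest"),
    }


def select_skill(obs: Observation) -> str:
    """Stand-in for selection code an LLM might write from a language
    task description such as 'go deeper, but rest when health is low'."""
    if obs["hp"] < 5:
        return "recover"
    if obs["stairs_visible"]:
        return "descend"
    return "explore"


def run_episode(observations: List[Observation]) -> List[str]:
    """Pick one skill per observation and record 'skill:action' pairs."""
    skills = make_skills()
    trace = []
    for obs in observations:
        skill = skills[select_skill(obs)]
        trace.append(f"{skill.name}:{skill.act(obs)}")
    return trace
```

Calling `run_episode` on a short list of toy observations returns the sequence of chosen skills and their actions. The point of the design is the separation of concerns: the selection function only decides which skill is active, while each skill decides the low-level action.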

arXiv information

Authors	Martin Klissarov, Mikael Henaff, Roberta Raileanu, Shagun Sodhani, Pascal Vincent, Amy Zhang, Pierre-Luc Bacon, Doina Precup, Marlos C. Machado, Pierluca D’Oro
Published	2024-12-11 16:59:31+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
