Advantage Alignment Algorithms

Summary

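Artificially intelligent agents, from large language model (LLM) assistants to autonomous vehicles, are increasingly integrated into human decision-making.
These systems often optimize their own individual objectives, which leads to conflict, particularly in general-sum games where naive reinforcement learning agents empirically converge to Pareto-suboptimal Nash equilibria.
To address this problem, opponent shaping has emerged as a paradigm for finding socially beneficial equilibria in general-sum games.
In this work, we introduce Advantage Alignment, a family of algorithms derived from first principles that perform opponent shaping efficiently and intuitively.
The algorithms do this by aligning the advantages of interacting agents, increasing the probability of mutually beneficial actions when the agents' interaction has been positive.
We prove that existing opponent shaping methods implicitly perform Advantage Alignment.
Compared with those methods, Advantage Alignment simplifies the mathematical formulation of opponent shaping, reduces the computational burden, and extends to continuous action domains.
We demonstrate the effectiveness of our algorithms across a range of social dilemmas, achieving state-of-the-art cooperation and robustness against exploitation.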
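
The phrase "Pareto-suboptimal Nash equilibria" can be made concrete with the one-shot Prisoner's Dilemma, a standard social dilemma. The sketch below is only an illustration: the payoff numbers and the helper names `is_nash` and `pareto_dominated` are assumptions chosen for this example and are not taken from the paper.

```python
from itertools import product

# Hypothetical Prisoner's Dilemma payoffs (assumed values, not from the paper).
# Actions: 0 = cooperate, 1 = defect. PAYOFFS[(a1, a2)] = (reward1, reward2).
PAYOFFS = {
    (0, 0): (-1, -1),
    (0, 1): (-3, 0),
    (1, 0): (0, -3),
    (1, 1): (-2, -2),
}
ACTIONS = (0, 1)


def is_nash(profile):
    """True if neither player gains by unilaterally changing its action."""
    a1, a2 = profile
    r1, r2 = PAYOFFS[profile]
    best1 = all(PAYOFFS[(d, a2)][0] <= r1 for d in ACTIONS)
    best2 = all(PAYOFFS[(a1, d)][1] <= r2 for d in ACTIONS)
    return best1 and best2


def pareto_dominated(profile):
    """True if another profile is at least as good for both players and strictly better for one."""
    r = PAYOFFS[profile]
    for other, o in PAYOFFS.items():
        if other == profile:
            continue
        if all(x >= y for x, y in zip(o, r)) and any(x > y for x, y in zip(o, r)):
            return True
    return False


nash = [p for p in product(ACTIONS, repeat=2) if is_nash(p)]
print("Pure Nash equilibria:", nash)
print("Pareto-dominated:", [pareto_dominated(p) for p in nash])
```

Under these assumed payoffs, mutual defection is the only pure-strategy Nash equilibrium, yet both players would be better off under mutual cooperation. This is the kind of outcome that opponent shaping aims to avoid.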

Abstract (original)

Artificially intelligent agents are increasingly being integrated into human decision-making: from large language model (LLM) assistants to autonomous vehicles. These systems often optimize their individual objective, leading to conflicts, particularly in general-sum games where naive reinforcement learning agents empirically converge to Pareto-suboptimal Nash equilibria. To address this issue, opponent shaping has emerged as a paradigm for finding socially beneficial equilibria in general-sum games. In this work, we introduce Advantage Alignment, a family of algorithms derived from first principles that perform opponent shaping efficiently and intuitively. We achieve this by aligning the advantages of interacting agents, increasing the probability of mutually beneficial actions when their interaction has been positive. We prove that existing opponent shaping methods implicitly perform Advantage Alignment. Compared to these methods, Advantage Alignment simplifies the mathematical formulation of opponent shaping, reduces the computational burden and extends to continuous action domains. We demonstrate the effectiveness of our algorithms across a range of social dilemmas, achieving state-of-the-art cooperation and robustness against exploitation.
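
One way to read "aligning the advantages of interacting agents" is as a per-step weight on the policy-gradient term that couples the other agent's current advantage with the agent's own past advantages. The sketch below is a toy interpretation of that sentence, not the update rule stated in the paper: the function name `alignment_weights`, the discount `gamma`, and the mixing coefficient `beta` are all assumptions made for illustration.

```python
import numpy as np


def alignment_weights(adv_self, adv_other, gamma=0.96, beta=1.0):
    """Illustrative per-step weights for grad log pi(a_t | s_t) of one agent.

    Assumed form (not taken from the paper):
        w_t = A_self[t] + beta * past_t * A_other[t]
    where past_t is a discounted sum of the agent's own advantages before step t.
    """
    adv_self = np.asarray(adv_self, dtype=float)
    adv_other = np.asarray(adv_other, dtype=float)
    weights = np.empty_like(adv_self)
    past = 0.0  # discounted sum of own advantages from earlier steps
    for t in range(len(adv_self)):
        # The coupling term is positive when the interaction so far has been
        # positive (past > 0) and the current action also benefits the other agent.
        weights[t] = adv_self[t] + beta * past * adv_other[t]
        past = gamma * (past + adv_self[t])
    return weights


w = alignment_weights([1.0, 1.0, 1.0], [2.0, 2.0, -2.0], gamma=0.5, beta=1.0)
print(w)
```

In this toy run the second step gets a larger weight than the agent's own advantage alone would give, because the earlier interaction was positive and the action also helps the other agent. The third step is penalized because it hurts the other agent.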

arXiv information

Authors	Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, Razvan Ciuca, Tianyu Zhang, Gauthier Gidel, Aaron Courville
Published	2025-02-06 18:12:57+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
