Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination

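Summary

Zero-shot human-AI coordination aims to develop AI agents that can work efficiently alongside humans they have never encountered, and achieving such coordination remains a major obstacle. Conventional algorithms optimize a fixed objective within a population to foster diversity in strategies and behaviors, and thereby to coordinate with humans. However, these methods can suffer learning loss and become unable to cooperate with specific strategies in the population, a phenomenon called cooperative incompatibility. To mitigate this problem, we introduce the COLE (Cooperative Open-ended LEarning) framework, which formulates open-ended objectives in two-player cooperative games from a graph-theoretic perspective in order to evaluate and identify the cooperative capacity of each strategy. We propose a practical algorithm that draws on game theory and graph theory (for example, the Shapley value and centrality). Theoretical and empirical analysis shows that COLE can effectively overcome cooperative incompatibility. We then built the COLE platform, an online Overcooked human-AI experiment platform on which questionnaires, model weights, and other settings can easily be customized. Using the COLE platform, we recruited 130 participants for human-subject experiments. Across a range of subjective metrics, participants preferred our approach over state-of-the-art methods. Objective results in the Overcooked game environment further indicate that our method outperforms existing ones when coordinating with previously unseen AI agents and the human proxy model. Our code and demo are publicly available at https://sites.google.com/view/cole-2023.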

Abstract (original)

Achieving coordination between humans and artificial intelligence in scenarios involving previously unencountered humans remains a substantial obstacle within Zero-Shot Human-AI Coordination, which aims to develop AI agents capable of efficiently working alongside previously unknown human teammates. Traditional algorithms have aimed to collaborate with humans by optimizing fixed objectives within a population, fostering diversity in strategies and behaviors. However, these techniques may lead to learning loss and an inability to cooperate with specific strategies within the population, a phenomenon named cooperative incompatibility. To mitigate this issue, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy. We put forth a practical algorithm incorporating insights from game theory and graph theory, e.g., Shapley Value and Centrality. We also show that COLE could effectively overcome the cooperative incompatibility from theoretical and empirical analysis. Subsequently, we created an online Overcooked human-AI experiment platform, the COLE platform, which enables easy customization of questionnaires, model weights, and other aspects. Utilizing the COLE platform, we enlist 130 participants for human experiments. Our findings reveal a preference for our approach over state-of-the-art methods using a variety of subjective metrics. Moreover, objective experimental outcomes in the Overcooked game environment indicate that our method surpasses existing ones when coordinating with previously unencountered AI agents and the human proxy model. Our code and demo are publicly available at https://sites.google.com/view/cole-2023.
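As a rough illustration of what it can mean to score each strategy's cooperative capacity with a Shapley value and a centrality measure, the sketch below treats a small population as a weighted directed graph. It is not the COLE algorithm. The payoff numbers are invented, and both the characteristic function `coalition_value` and the choice of weighted in-degree as the centrality measure are assumptions made only for this example.

```python
from itertools import permutations

def coalition_value(payoff, members):
    """Assumed characteristic function: total payoff over all ordered
    pairs (including self-play) inside the coalition."""
    return sum(payoff[i][j] for i in members for j in members)

def shapley_values(payoff):
    """Exact Shapley value of each strategy, averaging its marginal
    contribution over every join order (only feasible for tiny populations)."""
    n = len(payoff)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        coalition = []
        before = 0.0
        for player in order:
            coalition.append(player)
            after = coalition_value(payoff, coalition)
            totals[player] += after - before
            before = after
    return [total / len(orderings) for total in totals]

def weighted_in_degree(payoff):
    """A simple centrality measure: total payoff the other strategies
    obtain when partnered with strategy i."""
    n = len(payoff)
    return [sum(payoff[j][i] for j in range(n) if j != i) for i in range(n)]

# Hypothetical payoffs: payoff[i][j] is the reward when strategy i plays
# with strategy j. Strategies 0 and 2 fail to cooperate with each other.
payoff = [
    [10.0, 8.0, 0.0],
    [8.0, 6.0, 4.0],
    [0.0, 4.0, 12.0],
]

values = shapley_values(payoff)
in_degree = weighted_in_degree(payoff)
weakest = min(range(len(values)), key=values.__getitem__)

print("Shapley values:", values)
print("Weighted in-degree:", in_degree)
print("Lowest cooperative score: strategy", weakest)
```

In this toy population, strategy 2 has the highest self-play payoff but the lowest Shapley value and in-degree, because it coordinates poorly with the others. Exact enumeration grows factorially with population size, so a larger population would need a sampled approximation.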

arXiv information

Authors	Yang Li, Shao Zhang, Jichen Sun, Wenhao Zhang, Yali Du, Ying Wen, Xinbing Wang, Wei Pan
Published	2023-06-05 16:51:38+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
