Knowledge-based Reasoning and Learning under Partial Observability in Ad Hoc Teamwork

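Abstract

Ad hoc teamwork is the problem of enabling an agent to collaborate with teammates without prior coordination.
Data-driven methods are the current state of the art in ad hoc teamwork.
They use large labeled datasets of prior observations to model the behavior of other agent types and to determine the ad hoc agent's behavior.
These methods are computationally expensive, lack transparency, and have difficulty adapting to previously unseen changes, such as changes in team composition.
Our recent work introduced an architecture that determines the ad hoc agent's behavior through non-monotonic logical reasoning with prior commonsense domain knowledge and with predictive models of other agents' behavior learned from limited examples.
In this paper, we substantially expand the architecture's capabilities to support: (a) online selection, adaptation, and learning of the models that predict other agents' behavior;
(b) collaboration with teammates under partial observability and limited communication.
We demonstrate and experimentally evaluate the architecture's capabilities in two simulated multiagent benchmark domains for ad hoc teamwork: Fort Attack and Half Field Offense.
We show that the architecture's performance is comparable to or better than that of state-of-the-art data-driven baselines in both simple and complex scenarios, particularly with limited training data, partial observability, and changes in team composition.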

Abstract (original)

Ad hoc teamwork refers to the problem of enabling an agent to collaborate with teammates without prior coordination. Data-driven methods represent the state of the art in ad hoc teamwork. They use a large labeled dataset of prior observations to model the behavior of other agent types and to determine the ad hoc agent’s behavior. These methods are computationally expensive, lack transparency, and make it difficult to adapt to previously unseen changes, e.g., in team composition. Our recent work introduced an architecture that determined an ad hoc agent’s behavior based on non-monotonic logical reasoning with prior commonsense domain knowledge and predictive models of other agents’ behavior that were learned from limited examples. In this paper, we substantially expand the architecture’s capabilities to support: (a) online selection, adaptation, and learning of the models that predict the other agents’ behavior; and (b) collaboration with teammates in the presence of partial observability and limited communication. We illustrate and experimentally evaluate the capabilities of our architecture in two simulated multiagent benchmark domains for ad hoc teamwork: Fort Attack and Half Field Offense. We show that the performance of our architecture is comparable to or better than state-of-the-art data-driven baselines in both simple and complex scenarios, particularly in the presence of limited training data, partial observability, and changes in team composition.

arXiv information

Authors	Hasra Dodampegama, Mohan Sridharan
Published	2023-06-01 15:21:27+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
