Scalable spectral representations for multi-agent reinforcement learning in network MDPs

Abstract

Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning because the global state-action space grows exponentially with the number of agents.
In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent.
Building on these local spectral representations, we design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm.
Empirically, we validate the effectiveness of our scalable representation-based approach on two benchmark problems, and demonstrate its advantages over generic function approximation approaches to representing the local $Q$-functions.

arXiv information

Authors	Zhaolin Ren, Runyu Zhang, Bo Dai, Na Li
Published	2024-11-18 15:21:40+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
