Characterizing Speed Performance of Multi-Agent Reinforcement Learning

Abstract

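Multi-agent reinforcement learning (MARL) has achieved significant success in large-scale AI systems and in big-data applications such as smart grids and surveillance. Existing advances in MARL algorithms focus on improving the rewards obtained by introducing various mechanisms for inter-agent cooperation. However, these optimizations are usually compute- and memory-intensive, which leads to suboptimal speed performance in end-to-end training time. In this work, we analyze speed performance (i.e., latency-bounded throughput) as the key metric for MARL implementations. Specifically, we first introduce a taxonomy of MARL algorithms from an acceleration perspective, categorized by (1) training scheme and (2) communication method. Using this taxonomy, we identify three state-of-the-art MARL algorithms as target benchmarks: Multi-Agent Deep Deterministic Policy Gradient (MADDPG), Target-oriented Multi-agent Communication and Cooperation (ToM2C), and Networked Multi-Agent RL (NeurComm). We then systematically analyze their performance bottlenecks on a homogeneous multi-core CPU platform. We argue that MARL latency-bounded throughput should be a key performance metric in future literature, and we also address opportunities for parallelization and acceleration.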

Abstract (original)

Multi-Agent Reinforcement Learning (MARL) has achieved significant success in large-scale AI systems and big-data applications such as smart grids, surveillance, etc. Existing advancements in MARL algorithms focus on improving the rewards obtained by introducing various mechanisms for inter-agent cooperation. However, these optimizations are usually compute- and memory-intensive, thus leading to suboptimal speed performance in end-to-end training time. In this work, we analyze the speed performance (i.e., latency-bounded throughput) as the key metric in MARL implementations. Specifically, we first introduce a taxonomy of MARL algorithms from an acceleration perspective categorized by (1) training scheme and (2) communication method. Using our taxonomy, we identify three state-of-the-art MARL algorithms – Multi-Agent Deep Deterministic Policy Gradient (MADDPG), Target-oriented Multi-agent Communication and Cooperation (ToM2C), and Networked Multi-Agent RL (NeurComm) – as target benchmark algorithms, and provide a systematic analysis of their performance bottlenecks on a homogeneous multi-core CPU platform. We justify the need for MARL latency-bounded throughput to be a key performance metric in future literature while also addressing opportunities for parallelization and acceleration.

arXiv information

Authors	Samuel Wiggins, Yuan Meng, Rajgopal Kannan, Viktor Prasanna
Published	2023-09-13 17:26:36+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
