Impoola: The Power of Average Pooling for Image-Based Deep Reinforcement Learning

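Summary

As image-based deep reinforcement learning takes on more challenging tasks, larger models have become an important factor in improving performance.
Recent work has pursued this by focusing on the parameter efficiency of scaled networks, typically with Impala-CNN, a 15-layer ResNet-inspired network, as the image encoder.
However, although Impala-CNN clearly outperforms older CNN architectures, possible advances in network design for image encoders specific to deep reinforcement learning remain largely unexplored.
The authors find that replacing the flattening of the output feature maps in Impala-CNN with global average pooling yields a notable performance improvement.
This approach outperforms larger and more complex models on the Procgen Benchmark, particularly in terms of generalization.
They call the proposed encoder model Impoola-CNN.
Reduced sensitivity of the network to spatial translation may be central to this improvement, since the largest gains are observed in games without agent-centered observations.
The results show that network scaling is not only about increasing model size; efficient network design is also an essential factor.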
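
The difference between flattening and global average pooling can be illustrated with a small NumPy sketch. It is not the authors' implementation: the feature-map shape (32 channels on an 8x8 grid), the helper names `flatten_features` and `global_average_pool`, and the use of a circular shift to stand in for a translated observation are all assumptions made for illustration.

```python
import numpy as np

def flatten_features(fmap):
    """Flatten a (C, H, W) feature map into a vector of length C*H*W."""
    return fmap.reshape(-1)

def global_average_pool(fmap):
    """Average each channel over its spatial grid, giving a length-C vector."""
    return fmap.mean(axis=(1, 2))

rng = np.random.default_rng(0)

# Hypothetical encoder output: 32 channels on an 8x8 grid (sizes are assumptions).
fmap = rng.standard_normal((32, 8, 8))

# The same features moved two cells along the width axis (circular shift,
# used here only as a simple stand-in for a translated observation).
shifted = np.roll(fmap, shift=2, axis=2)

# Flattening keeps every spatial position; pooling keeps one value per channel.
flat_dim = flatten_features(fmap).shape[0]
pool_dim = global_average_pool(fmap).shape[0]

# Flattened features change under the shift, pooled features do not.
flat_changed = not np.allclose(flatten_features(fmap), flatten_features(shifted))
pool_unchanged = np.allclose(global_average_pool(fmap), global_average_pool(shifted))

print(flat_dim, pool_dim, flat_changed, pool_unchanged)
```

In this toy setting the pooled vector is much smaller than the flattened one, so the layer that follows needs far fewer input weights, and the pooled vector is unchanged by the shift. A real convolutional encoder is only approximately insensitive to translation, because of padding and strides.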

Summary (original)

As image-based deep reinforcement learning tackles more challenging tasks, increasing model size has become an important factor in improving performance. Recent studies achieved this by focusing on the parameter efficiency of scaled networks, typically using Impala-CNN, a 15-layer ResNet-inspired network, as the image encoder. However, while Impala-CNN evidently outperforms older CNN architectures, potential advancements in network design for deep reinforcement learning-specific image encoders remain largely unexplored. We find that replacing the flattening of output feature maps in Impala-CNN with global average pooling leads to a notable performance improvement. This approach outperforms larger and more complex models in the Procgen Benchmark, particularly in terms of generalization. We call our proposed encoder model Impoola-CNN. A decrease in the network’s translation sensitivity may be central to this improvement, as we observe the most significant gains in games without agent-centered observations. Our results demonstrate that network scaling is not just about increasing model size – efficient network design is also an essential factor.

arXiv information

Authors	Raphael Trumpp,Ansgar Schäfftlein,Mirco Theile,Marco Caccamo
Published	2025-03-07 16:19:19+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
