Equivariant Reinforcement Learning under Partial Observability

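Summary

Incorporating inductive biases is a promising way to tackle challenging robot learning domains with sample-efficient solutions.
This paper identifies partially observable domains in which symmetry can serve as an inductive bias that aids efficient learning.
Specifically, by encoding equivariance with respect to specific group symmetries into the neural networks, the actor-critic reinforcement learning agents can reuse past solutions in related scenarios.
As a result, the equivariant agents significantly outperform non-equivariant approaches in both sample efficiency and final performance, as demonstrated in experiments on a range of robotic tasks in simulation and on real hardware.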

Abstract (original)

Incorporating inductive biases is a promising approach for tackling challenging robot learning domains with sample-efficient solutions. This paper identifies partially observable domains where symmetries can be a useful inductive bias for efficient learning. Specifically, by encoding the equivariance regarding specific group symmetries into the neural networks, our actor-critic reinforcement learning agents can reuse solutions in the past for related scenarios. Consequently, our equivariant agents outperform non-equivariant approaches significantly in terms of sample efficiency and final performance, demonstrated through experiments on a range of robotic tasks in simulation and real hardware.

arXiv information

Authors	Hai Nguyen, Andrea Baisero, David Klee, Dian Wang, Robert Platt, Christopher Amato
Published	2024-08-26 15:07:01+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
