Umbrella Reinforcement Learning — computationally efficient tool for hard non-linear problems

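Summary

We report a new, computationally efficient approach for solving hard nonlinear reinforcement learning (RL) problems.
The approach combines umbrella sampling, a technique from computational physics and chemistry, with optimal control methods.
It is implemented with neural networks trained by policy gradient.
On hard RL problems with sparse rewards, state traps, and no terminal states, it outperforms all available state-of-the-art algorithms in computational efficiency and implementation universality.
The approach uses an ensemble of simultaneously acting agents and a modified reward that includes the ensemble entropy, which yields an optimal balance between exploration and exploitation.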

Summary (original)

We report a novel, computationally efficient approach for solving hard nonlinear problems of reinforcement learning (RL). Here we combine umbrella sampling, from computational physics/chemistry, with optimal control methods. The approach is realized on the basis of neural networks, with the use of policy gradient. It outperforms, by computational efficiency and implementation universality, all available state-of-the-art algorithms, in application to hard RL problems with sparse reward, state traps and lack of terminal states. The proposed approach uses an ensemble of simultaneously acting agents, with a modified reward which includes the ensemble entropy, yielding an optimal exploration-exploitation balance.
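The abstract does not give the exact form of the modified reward, so the following is only a minimal sketch of the general idea of adding an ensemble-entropy term to the reward. Everything in it is an assumption made for illustration: the one-dimensional state, the histogram-based Shannon entropy, the weight `alpha`, the `bin_width` parameter, and the function names `ensemble_entropy` and `modified_reward` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def ensemble_entropy(states, bin_width=1.0):
    """Shannon entropy (in nats) of the ensemble's occupancy over 1-D bins.

    Assumption: states are scalars and are discretized into bins of equal
    width; the paper may define the ensemble entropy differently.
    """
    if not states:
        raise ValueError("ensemble must contain at least one agent")
    # Count how many agents fall into each bin.
    counts = Counter(math.floor(s / bin_width) for s in states)
    n = len(states)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def modified_reward(states, rewards, alpha=0.1, bin_width=1.0):
    """Mean ensemble reward plus alpha times the ensemble entropy.

    Assumption: a simple additive bonus with a hypothetical weight alpha.
    """
    mean_reward = sum(rewards) / len(rewards)
    return mean_reward + alpha * ensemble_entropy(states, bin_width)


# Two ensembles of four agents, neither of which has found any reward yet.
clustered = [0.1, 0.2, 0.3, 0.4]   # all agents sit in the same bin
spread = [0.1, 1.2, 2.3, 3.4]      # each agent occupies its own bin
no_reward = [0.0, 0.0, 0.0, 0.0]

print(modified_reward(clustered, no_reward))
print(modified_reward(spread, no_reward))
```

Under these assumptions, the spread-out ensemble receives the larger modified reward even when the environment reward is zero everywhere, which illustrates how an entropy term can keep agents exploring in sparse-reward settings.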

arXiv information

Authors	Egor E. Nuzhin, Nikolai V. Brilliantov
Published	2024-11-21 13:34:36+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
