Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks

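Summary

Tasks whose rewards depend on past information not available in the current observation set can only be solved by agents equipped with short-term memory.
The usual choices for a memory module include trainable recurrent hidden layers, often with gated memory.
Reservoir computing offers an alternative in which the recurrent layer is not trained and instead has a fixed set of sparse recurrent weights.
The weights are scaled to produce stable dynamical behavior, so that the reservoir state contains a high-dimensional, nonlinear impulse response function of the inputs.
An output decoder network can then map the compressed history represented by the reservoir state to any output, including agent actions or predictions.
This study finds that reservoir computing greatly simplifies and speeds up reinforcement learning on memory tasks by (1) eliminating the need to backpropagate gradients through time, (2) presenting all recent history to the downstream network simultaneously, and (3) performing many useful, generic nonlinear computations upstream of the trained modules.
In particular, these findings offer significant benefit to meta-learning, which depends primarily on efficient and highly general memory systems.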

Abstract (original)

Tasks in which rewards depend upon past information not available in the current observation set can only be solved by agents that are equipped with short-term memory. Usual choices for memory modules include trainable recurrent hidden layers, often with gated memory. Reservoir computing presents an alternative, in which a recurrent layer is not trained, but rather has a set of fixed, sparse recurrent weights. The weights are scaled to produce stable dynamical behavior such that the reservoir state contains a high-dimensional, nonlinear impulse response function of the inputs. An output decoder network can then be used to map the compressive history represented by the reservoir’s state to any outputs, including agent actions or predictions. In this study, we find that reservoir computing greatly simplifies and speeds up reinforcement learning on memory tasks by (1) eliminating the need for backpropagation of gradients through time, (2) presenting all recent history simultaneously to the downstream network, and (3) performing many useful and generic nonlinear computations upstream from the trained modules. In particular, these findings offer significant benefit to meta-learning that depends primarily on efficient and highly general memory systems.
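The mechanism described in the abstract (fixed sparse recurrent weights, scaling for stable dynamics, and a trained decoder on top) can be illustrated with a minimal echo-state-style sketch in NumPy. This is not the author's implementation: the tanh update rule, the reservoir size, the 10% density, the spectral radius of 0.9, and the function names are all assumptions chosen for illustration. The readout here is fitted by supervised ridge regression on a toy delayed-recall task, not by the reinforcement-learning setup the paper studies. The point it shows is that only the readout is fitted, so no gradients are propagated through time.

```python
import numpy as np


def make_reservoir(n_inputs, n_units, density=0.1, spectral_radius=0.9, seed=0):
    """Build fixed (never trained) input and recurrent weights.

    All default values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    # Sparse recurrent matrix: roughly `density` of the entries are nonzero.
    mask = rng.random((n_units, n_units)) < density
    w_rec = rng.normal(size=(n_units, n_units)) * mask
    # Rescale so the largest eigenvalue magnitude equals `spectral_radius`,
    # a common heuristic for keeping the dynamics stable.
    radius = np.max(np.abs(np.linalg.eigvals(w_rec)))
    w_rec *= spectral_radius / radius
    return w_in, w_rec


def run_reservoir(w_in, w_rec, inputs):
    """Drive the reservoir with `inputs` (shape [T, n_inputs]); return all states."""
    state = np.zeros(w_rec.shape[0])
    states = np.empty((len(inputs), w_rec.shape[0]))
    for t, u in enumerate(inputs):
        # The state mixes the new input with a nonlinear trace of past inputs.
        state = np.tanh(w_in @ u + w_rec @ state)
        states[t] = state
    return states


def fit_readout(states, targets, ridge=1e-6):
    """Fit a linear decoder by closed-form ridge regression (no backprop)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


def delayed_recall_demo(delay=3, n_steps=2000, n_units=100, seed=1):
    """Toy memory task: predict the input seen `delay` steps earlier."""
    rng = np.random.default_rng(seed)
    inputs = rng.uniform(-1.0, 1.0, size=(n_steps, 1))
    targets = np.roll(inputs[:, 0], delay)  # first `delay` entries are invalid
    w_in, w_rec = make_reservoir(1, n_units)
    states = run_reservoir(w_in, w_rec, inputs)
    washout, split = 100, 1500  # discard the initial transient, then train/test
    w_out = fit_readout(states[washout:split], targets[washout:split])
    predictions = states[split:] @ w_out
    # Correlation between decoded and true delayed input on held-out steps.
    return np.corrcoef(predictions, targets[split:])[0, 1]


if __name__ == "__main__":
    print("held-out correlation:", delayed_recall_demo())
```

In a reinforcement-learning agent, the ridge-regression readout would be replaced by a policy or value network that takes the reservoir state as its input, with the reservoir weights still left untouched.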

arXiv information

Author	Kevin McKee
Published	2024-12-17 17:02:06+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
