Recurrent Memory Decision Transformer

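Summary

Transformer models were originally developed for natural language problems, but they have recently become widely used in offline reinforcement learning tasks.
This is because the agent's history can be represented as a sequence, which reduces the whole task to a sequence modeling task.
However, the quadratic complexity of the transformer operation limits how far the context can be extended.
For this reason, various memory mechanisms are used to process long sequences in natural language.
This paper proposes the Recurrent Memory Decision Transformer (RMDT), a model that uses a recurrent memory mechanism for reinforcement learning problems.
We conduct thorough experiments on Atari games and MuJoCo control problems and show that the proposed model significantly outperforms its counterparts without the recurrent memory mechanism on Atari games.
We also carefully study the effect of memory on the performance of the proposed model.
These findings shed light on the potential of incorporating recurrent memory mechanisms to improve the performance of large-scale transformer models in offline reinforcement learning tasks.
The Recurrent Memory Decision Transformer code is publicly available in the repository \url{https://anonymous.4open.science/r/RMDT-4FE4}.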
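
To make the quadratic-cost argument concrete, the following minimal sketch counts query-key pairs in a single self-attention layer, first over one long context and then over fixed-size segments that each carry a few memory tokens. The abstract does not describe how RMDT lays out its memory, so the segment layout, the function names, and all numbers here are illustrative assumptions rather than the authors' implementation.

```python
def full_attention_pairs(seq_len: int) -> int:
    """Query-key pairs when one attention layer sees the whole sequence at once."""
    return seq_len * seq_len


def segmented_attention_pairs(seq_len: int, segment_len: int, num_memory_tokens: int) -> int:
    """Query-key pairs when the sequence is processed segment by segment.

    Assumption (not taken from the paper): every segment is extended by a
    fixed number of memory tokens, and attention is computed only inside
    that extended segment.
    """
    total = 0
    for start in range(0, seq_len, segment_len):
        # The last segment may be shorter than segment_len.
        tokens_in_segment = min(segment_len, seq_len - start)
        window = tokens_in_segment + num_memory_tokens
        total += window * window
    return total
```

Under these assumptions, a 3000-token history costs 9,000,000 pairs with full attention, while ten 300-token segments with 10 memory tokens each cost 961,000 pairs. The memory tokens are what would let information cross segment boundaries at that lower cost.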

Abstract (original)

Originally developed for natural language problems, transformer models have recently been widely used in offline reinforcement learning tasks. This is because the agent’s history can be represented as a sequence, and the whole task can be reduced to the sequence modeling task. However, the quadratic complexity of the transformer operation limits the potential increase in context. Therefore, different versions of the memory mechanism are used to work with long sequences in a natural language. This paper proposes the Recurrent Memory Decision Transformer (RMDT), a model that uses a recurrent memory mechanism for reinforcement learning problems. We conduct thorough experiments on Atari games and MuJoCo control problems and show that our proposed model is significantly superior to its counterparts without the recurrent memory mechanism on Atari games. We also carefully study the effect of memory on the performance of the proposed model. These findings shed light on the potential of incorporating recurrent memory mechanisms to improve the performance of large-scale transformer models in offline reinforcement learning tasks. The Recurrent Memory Decision Transformer code is publicly available in the repository \url{https://anonymous.4open.science/r/RMDT-4FE4}.

arXiv information

Authors	Arkadii Bessonov, Alexey Staroverov, Huzhenyu Zhang, Alexey K. Kovalev, Dmitry Yudin, Aleksandr I. Panov
Published	2023-07-05 06:20:35+00:00

Source and services used

arxiv.jp, Google
