Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork

Abstract

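Deep reinforcement learning algorithms are typically held back by sample inefficiency: they need many interactions with the environment to acquire accurate decision-making ability.
Humans, by contrast, rely on the hippocampus to retrieve relevant information from past experience on related tasks, and this guides their decisions when learning a new task instead of relying on environmental interaction alone.
Designing a hippocampus-like module that lets an agent incorporate past experience into established reinforcement learning algorithms nevertheless poses two challenges.
The first is selecting the past experiences most relevant to the current task; the second is integrating those experiences into the decision network.
To address these challenges, we propose a new method that uses a retrieval network built on a task-conditioned hypernetwork, which adapts the retrieval network's parameters to the task.
In addition, a dynamic modification mechanism strengthens the cooperation between the retrieval network and the decision network.
We evaluate the proposed method in the MiniGrid environment. The experimental results show that it significantly outperforms strong baselines.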

Abstract (original)

Deep reinforcement learning algorithms are usually impeded by sampling inefficiency, heavily depending on multiple interactions with the environment to acquire accurate decision-making capabilities. In contrast, humans rely on their hippocampus to retrieve relevant information from past experiences of relevant tasks, which guides their decision-making when learning a new task, rather than exclusively depending on environmental interactions. Nevertheless, designing a hippocampus-like module for an agent to incorporate past experiences into established reinforcement learning algorithms presents two challenges. The first challenge involves selecting the most relevant past experiences for the current task, and the second challenge is integrating such experiences into the decision network. To address these challenges, we propose a novel method that utilizes a retrieval network based on task-conditioned hypernetwork, which adapts the retrieval network’s parameters depending on the task. At the same time, a dynamic modification mechanism enhances the collaborative efforts between the retrieval and decision networks. We evaluate the proposed method on the MiniGrid environment. The experimental results demonstrate that our proposed method significantly outperforms strong baselines.
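
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a task-conditioned hypernetwork. A small generator maps a task embedding to the weights of a retrieval layer, and that layer is then used to score stored experiences against a query. Every name here (`make_hypernetwork`, `generate_weights`, `retrieve`), the linear form of the generator, and the dot-product scoring are illustrative assumptions, not the authors' architecture.

```python
import math
import random


def make_hypernetwork(task_dim, in_dim, out_dim, seed=0):
    """Hypothetical linear hypernetwork: one row per generated retrieval weight."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(task_dim)
    return [[rng.gauss(0.0, scale) for _ in range(task_dim)]
            for _ in range(in_dim * out_dim)]


def generate_weights(hyper, task_embedding, in_dim, out_dim):
    """Map a task embedding to an (out_dim x in_dim) retrieval weight matrix."""
    flat = [sum(h * t for h, t in zip(row, task_embedding)) for row in hyper]
    return [flat[r * in_dim:(r + 1) * in_dim] for r in range(out_dim)]


def project(weights, x):
    """Apply the generated linear layer to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def retrieve(weights, query, memory_keys, k=2):
    """Return indices of the k stored keys that score highest against the query.

    Assumption: relevance is a dot product in the task-specific projected space.
    """
    q = project(weights, query)
    scores = [sum(a * b for a, b in zip(q, project(weights, key)))
              for key in memory_keys]
    ranked = sorted(range(len(memory_keys)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    hyper = make_hypernetwork(task_dim=3, in_dim=4, out_dim=2)
    # Two different (made-up) task embeddings yield two different retrieval layers.
    w_task_a = generate_weights(hyper, [1.0, 0.0, 0.0], in_dim=4, out_dim=2)
    w_task_b = generate_weights(hyper, [0.0, 1.0, 0.0], in_dim=4, out_dim=2)
    memory = [[0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
    query = [1.0, 0.0, 0.0, 0.0]
    print(retrieve(w_task_a, query, memory, k=2))
    print(retrieve(w_task_b, query, memory, k=2))
```

Because the retrieval weights are a function of the task embedding, the same query can rank the same memory differently under different tasks, which is the property the abstract attributes to the task-conditioned retrieval network. How the retrieved experiences are then merged into the decision network is not specified in the abstract and is left out of this sketch.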

arXiv information

Authors	Yonggang Jin, Chenxu Wang, Liuyu Xiang, Yaodong Yang, Junge Zhang, Jie Fu, Zhaofeng He
Published	2023-08-16 02:11:16+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
