Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork




Deep reinforcement learning algorithms are usually impeded by sample inefficiency: they depend heavily on many interactions with the environment to acquire accurate decision-making capabilities. In contrast, humans appear to rely on the hippocampus to retrieve relevant information from past experiences on related tasks, which guides their decision-making when learning a new task, rather than depending exclusively on environmental interactions. Nevertheless, designing a hippocampus-like module that lets an agent incorporate past experiences into established reinforcement learning algorithms presents two challenges. The first is selecting the past experiences most relevant to the current task; the second is integrating those experiences into the decision network. To address these challenges, we propose a novel algorithm that uses a retrieval network based on a task-conditioned hypernetwork, which adapts the retrieval network's parameters to the task. At the same time, a dynamic modification mechanism strengthens the collaboration between the retrieval and decision networks. We evaluate the proposed algorithm on the challenging MiniGrid environment. The experimental results demonstrate that the proposed method significantly outperforms strong baselines.
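
The abstract does not specify the architecture, so the following is only a minimal sketch of the general idea of a task-conditioned hypernetwork driving retrieval, not the authors' implementation. It assumes a single linear hypernetwork that maps a task embedding to the weight matrix of a linear retrieval network, and it assumes cosine similarity with top-k selection over stored memory keys. All function names, dimensions, and the similarity measure are hypothetical choices made for illustration; the dynamic modification mechanism is not modeled.

```python
import math
import random


def make_hypernetwork(task_dim, query_dim, key_dim, seed=0):
    """Hypothetical linear hypernetwork: one weight row per generated parameter."""
    rng = random.Random(seed)
    n_out = key_dim * query_dim
    return [[rng.gauss(0.0, 0.5) for _ in range(task_dim)] for _ in range(n_out)]


def generate_retrieval_weights(hyper, task_embedding, query_dim, key_dim):
    """Generate the retrieval network's (key_dim x query_dim) matrix for one task."""
    flat = [sum(w * t for w, t in zip(row, task_embedding)) for row in hyper]
    return [flat[i * query_dim:(i + 1) * query_dim] for i in range(key_dim)]


def project(weights, vector):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm > 0 else 0.0


def retrieve(hyper, task_embedding, observation, memory_keys, k=2):
    """Return indices of the k memory keys most similar to the task-conditioned query."""
    key_dim = len(memory_keys[0])
    weights = generate_retrieval_weights(
        hyper, task_embedding, len(observation), key_dim
    )
    query = project(weights, observation)
    scores = [cosine(query, key) for key in memory_keys]
    ranked = sorted(range(len(memory_keys)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

Under these assumptions, calling `retrieve(hyper, task_embedding, observation, memory_keys)` with two different task embeddings projects the same observation through two different generated weight matrices, so the same memory can be ranked differently per task. That is the property the abstract attributes to adapting the retrieval network's parameters to the task.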


Authors: Yonggang Jin, Chenxu Wang, Liuyu Xiang, Yaodong Yang, Jie Fu, Zhaofeng He
Published: 2023-06-21 08:41:27+00:00


Categories: cs.AI, cs.LG