A General Perspective on Objectives of Reinforcement Learning

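Summary

This lecture presents a general perspective on the objectives of reinforcement learning (RL) and gives three versions of the objective.
The first version is the standard definition of the objective in the RL literature.
The second extends the standard definition to a $\lambda$-return version, which unifies the standard definition of the objective.
Finally, a general objective is proposed that unifies the previous two versions.
This last version offers a high-level view of the RL objective: it gives a fundamental formulation that connects several widely used RL techniques (e.g., TD$(\lambda)$ and GAE), and the objective can potentially be applied to a wide range of RL algorithms.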
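
For readers unfamiliar with the two techniques named in the summary, the following is a minimal sketch of the textbook $\lambda$-return and GAE computations on a single finite trajectory. It is not the paper's formulation. The function names, the list-based inputs, and the convention that `values` carries one extra bootstrap entry for the final state are all assumptions made for illustration.

```python
from typing import List


def lambda_returns(rewards: List[float], values: List[float],
                   gamma: float, lam: float) -> List[float]:
    """Textbook lambda-return via the backward recursion
    G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).

    Assumption: `values` has len(rewards) + 1 entries, the last one being
    the bootstrap value of the state reached after the final reward.
    """
    horizon = len(rewards)
    returns = [0.0] * horizon
    next_return = values[horizon]  # bootstrap: G_T is taken to be V(s_T)
    for t in reversed(range(horizon)):
        next_return = rewards[t] + gamma * (
            (1.0 - lam) * values[t + 1] + lam * next_return
        )
        returns[t] = next_return
    return returns


def gae_advantages(rewards: List[float], values: List[float],
                   gamma: float, lam: float) -> List[float]:
    """Generalized advantage estimation as a discounted sum of TD errors:
    A_t = delta_t + gamma * lam * A_{t+1},
    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t).
    """
    horizon = len(rewards)
    advantages = [0.0] * horizon
    running = 0.0  # no advantage is accumulated beyond the final step
    for t in reversed(range(horizon)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Under these conventions, `lam=0` reduces the $\lambda$-return to the one-step TD target and `lam=1` to the bootstrapped discounted return. The GAE advantage at each step equals the $\lambda$-return minus the value estimate, which is one concrete sense in which the two techniques are connected.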

Abstract (original)

In this lecture, we present a general perspective on reinforcement learning (RL) objectives, where we show three versions of objectives. The first version is the standard definition of objective in RL literature. Then we extend the standard definition to the $\lambda$-return version, which unifies the standard definition of objective. Finally, we propose a general objective that unifies the previous two versions. The last version provides a high level to understand of RL’s objective, where it shows a fundamental formulation that connects some widely used RL techniques (e.g., TD$(\lambda)$ and GAE), and this objective can be potentially applied to extensive RL algorithms.

arXiv information

Author	Long Yang
Published	2023-06-05 17:50:29+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
