A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning

Summary

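Representation learning is central to the empirical success of deep learning in coping with the curse of dimensionality.
However, its power has not yet been fully exploited in reinforcement learning (RL), for two reasons: i) the trade-off between expressiveness and tractability; and ii) the coupling between exploration and representation learning.
This paper first shows that, under a noise assumption on the stochastic control model, the linear spectral features of the corresponding Markov transition operator can be obtained in closed form for free.
Based on this observation, the authors propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and carries out optimistic exploration for representation learning by exploiting the structure of the noise.
They provide a rigorous theoretical analysis of SPEDE and show that, in practice, it outperforms existing state-of-the-art empirical algorithms on several benchmarks.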
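
The "closed form for free" claim can be made concrete with a small sketch. The abstract does not say which noise assumption is used, so the sketch below assumes additive Gaussian noise, s' = f(s, a) + ε with ε ~ N(0, σ²I). Under that assumption the transition density is a Gaussian kernel between f(s, a) and s' (up to the constant (2πσ²)^(-d/2)), and a Gaussian kernel can be approximated by an inner product of random Fourier features. The function names, the toy dynamics, and the choice of random Fourier features are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Build a random Fourier feature map for the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = random.Random(seed)
    # Frequencies w ~ N(0, I / sigma^2); phases b ~ Uniform[0, 2*pi).
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return features


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel value, used here only as a reference."""
    sq_dist = sum((x_i - y_i) ** 2 for x_i, y_i in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def toy_dynamics(state, action):
    """Hypothetical mean dynamics f(s, a); any deterministic map would do."""
    return [0.9 * state[0] + 0.1 * action[0],
            0.9 * state[1] - 0.1 * action[0]]


def approx_kernel(features, x, y):
    """Linear (inner-product) approximation of the kernel value."""
    return sum(p * q for p, q in zip(features(x), features(y)))


if __name__ == "__main__":
    sigma = 1.0
    feats = make_rff(dim=2, num_features=5000, sigma=sigma, seed=0)
    s, a, s_next = [0.5, -0.2], [0.3], [0.4, -0.1]
    mean = toy_dynamics(s, a)
    # phi(s, a) = feats(f(s, a)) and mu(s') = feats(s') give a linear
    # factorization of the (unnormalized) transition density.
    print("exact :", gaussian_kernel(mean, s_next, sigma))
    print("approx:", approx_kernel(feats, mean, s_next))
```

In this sketch the feature map depends only on the dynamics f and the noise scale, so learning a representation reduces to estimating f; the approximation error shrinks as `num_features` grows.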

Summary (original)

Representation learning lies at the heart of the empirical success of deep learning for dealing with the curse of dimensionality. However, the power of representation learning has not been fully exploited yet in reinforcement learning (RL), due to i), the trade-off between expressiveness and tractability; and ii), the coupling between exploration and representation learning. In this paper, we first reveal the fact that under some noise assumption in the stochastic control model, we can obtain the linear spectral feature of its corresponding Markov transition operator in closed-form for free. Based on this observation, we propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise. We provide rigorous theoretical analysis of SPEDE, and demonstrate the practical superior performance over the existing state-of-the-art empirical algorithms on several benchmarks.

arXiv information

Authors	Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai
Published	2023-03-07 16:34:59+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
