A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning




Representation learning lies at the heart of the empirical success of deep learning in dealing with the curse of dimensionality. However, the power of representation learning has not yet been fully exploited in reinforcement learning (RL), due to (i) the trade-off between expressiveness and tractability and (ii) the coupling between exploration and representation learning. In this paper, we first show that, under certain noise assumptions in the stochastic control model, the linear spectral feature of the corresponding Markov transition operator can be obtained in closed form for free. Based on this observation, we propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and achieves optimistic exploration for representation learning by exploiting the structure of the noise. We provide a rigorous theoretical analysis of SPEDE and demonstrate superior practical performance over existing state-of-the-art empirical algorithms on several benchmarks.
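
To make the closed-form feature claim concrete, the sketch below illustrates one way such a factorization can arise. It assumes, purely for illustration, additive Gaussian noise, s' = f(s, a) + ε with ε ~ N(0, σ²I); the abstract itself only refers to "some noise assumption". Under that assumption the transition density is proportional to the Gaussian kernel exp(-||s' - f(s, a)||² / (2σ²)), which random Fourier features approximate as an inner product ⟨φ(f(s, a)), φ(s')⟩. The names `make_rff`, `gaussian_kernel`, and `dynamics`, as well as the toy linear dynamics, are hypothetical and are not taken from the paper.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Build phi such that phi(x) . phi(y) ~= exp(-||x - y||^2 / (2 sigma^2))."""
    rng = random.Random(seed)
    # Frequencies drawn from N(0, I / sigma^2), phases from U[0, 2*pi).
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return phi


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel, i.e. the unnormalized Gaussian transition density."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dynamics(state, action):
    """Hypothetical mean dynamics f(s, a); a toy linear map used only for this demo."""
    return [0.9 * s + 0.1 * action for s in state]


sigma = 0.5  # assumed noise standard deviation
phi = make_rff(dim=2, num_features=5000, sigma=sigma)

mean_next = dynamics([0.3, -0.2], action=1.0)  # f(s, a)
next_state = [0.5, 0.1]                        # candidate s'

exact = gaussian_kernel(mean_next, next_state, sigma)
approx = sum(p * q for p, q in zip(phi(mean_next), phi(next_state)))
print(f"exact kernel = {exact:.4f}, feature inner product = {approx:.4f}")
```

In this sketch the feature map depends on (s, a) only through f(s, a), so once the mean dynamics are known the linear feature φ(f(s, a)) comes without any separate representation-learning step. This is one reading of the "for free" claim under the Gaussian assumption, not a reproduction of the SPEDE algorithm.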


Authors: Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai
Published: 2023-03-07 16:34:59+00:00
Categories: cs.AI, cs.LG, stat.ML