Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels?

Summary

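We investigate whether self-supervised learning (SSL) can improve online reinforcement learning (RL) from pixels.
We extend the contrastive reinforcement learning framework (e.g., CURL), which jointly optimizes SSL and RL losses, and run extensive experiments with a variety of self-supervised losses.
Our observations suggest that, when the same amount of data and augmentation is used, the existing SSL framework for RL fails to bring meaningful improvement over baselines that rely only on image augmentation.
We further perform an evolutionary search for the optimal combination of multiple self-supervised losses for RL, but find that even such a combination fails to meaningfully outperform methods that use only carefully designed image augmentations.
In many cases, using self-supervised losses under the existing framework lowered RL performance.
We evaluate the approach in multiple environments, including a real-world robot environment, and confirm that no single self-supervised loss or image augmentation method dominates across all environments, and that the current framework for joint optimization of SSL and RL is limited.
Finally, we empirically investigate the pretraining framework for SSL + RL and the properties of the representations learned with different approaches.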

Abstract (original)

We investigate whether self-supervised learning (SSL) can improve online reinforcement learning (RL) from pixels. We extend the contrastive reinforcement learning framework (e.g., CURL) that jointly optimizes SSL and RL losses and conduct an extensive amount of experiments with various self-supervised losses. Our observations suggest that the existing SSL framework for RL fails to bring meaningful improvement over the baselines only taking advantage of image augmentation when the same amount of data and augmentation is used. We further perform an evolutionary search to find the optimal combination of multiple self-supervised losses for RL, but find that even such a loss combination fails to meaningfully outperform the methods that only utilize carefully designed image augmentations. Often, the use of self-supervised losses under the existing framework lowered RL performances. We evaluate the approach in multiple different environments including a real-world robot environment and confirm that no single self-supervised loss or image augmentation method can dominate all environments and that the current framework for joint optimization of SSL and RL is limited. Finally, we empirically investigate the pretraining framework for SSL + RL and the properties of representations learned with different approaches.
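The joint optimization of SSL and RL losses described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `info_nce` and `joint_loss`, the `ssl_weight` and `temperature` parameters, and the plain dot-product similarity are assumptions chosen for illustration (CURL itself uses a learned bilinear similarity). The sketch only shows the general shape of a contrastive loss being added to an RL loss with a weighting coefficient.

```python
import math


def info_nce(queries, keys, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    queries[i] and keys[i] are assumed to be embeddings of two augmented
    views of the same observation (the positive pair); every other key in
    the batch serves as a negative. Similarity is a plain dot product,
    which is a simplification made for this sketch.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        # Numerically stable log-sum-exp over all keys.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the positive key as the target class.
        total += log_sum_exp - logits[i]
    return total / len(queries)


def joint_loss(rl_loss, ssl_loss, ssl_weight=1.0):
    """Weighted sum of an RL loss and a self-supervised loss (hypothetical weighting)."""
    return rl_loss + ssl_weight * ssl_loss


# Toy embeddings: matching pairs should give a much lower loss than mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
total = joint_loss(rl_loss=1.0, ssl_loss=aligned, ssl_weight=0.5)
```

In a real agent both terms would be computed from the same encoder and backpropagated together. The abstract's finding is that this extra term often fails to help beyond image augmentation alone.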

arXiv information

Authors	Xiang Li, Jinghuan Shang, Srijan Das, Michael S. Ryoo
Published	2022-06-10 17:59:30+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
