The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning

Abstract

Visual Reinforcement Learning (RL) methods often require extensive amounts of data. As opposed to model-free RL, model-based RL (MBRL) offers a potential solution with efficient data utilization through planning. Additionally, RL lacks generalization capabilities for real-world tasks. Prior work has shown that incorporating pre-trained visual representations (PVRs) enhances sample efficiency and generalization. While PVRs have been extensively studied in the context of model-free RL, their potential in MBRL remains largely unexplored. In this paper, we benchmark a set of PVRs on challenging control tasks in a model-based RL setting. We investigate the data efficiency, generalization capabilities, and the impact of different properties of PVRs on the performance of model-based agents. Our results, perhaps surprisingly, reveal that for MBRL current PVRs are not more sample efficient than learning representations from scratch, and that they do not generalize better to out-of-distribution (OOD) settings. To explain this, we analyze the quality of the trained dynamics model. Furthermore, we show that data diversity and network architecture are the most important contributors to OOD generalization performance.

arXiv information

Authors	Moritz Schneider, Robert Krug, Narunas Vaskevicius, Luigi Palmieri, Joschka Boedecker
Published	2025-01-15 15:24:32+00:00
arXiv page	arxiv_id(pdf)
