A Generalist Dynamics Model for Control

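Abstract

We study the use of transformer sequence models as dynamics models (TDMs) for control.
In a number of experiments in the DeepMind control suite, we find, first, that TDMs perform well in a single-environment learning setting when compared to baseline models.
Second, TDMs show strong generalization to unseen environments, both in a few-shot setting, where a generalist model is fine-tuned with a small amount of data from the target environment, and in a zero-shot setting, where a generalist model is applied to an unseen environment without any further training.
Furthermore, we show that generalizing system dynamics can work much better than generalizing optimal behavior directly as a policy.
This makes TDMs a promising ingredient for a foundation model of control.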

Abstract (original)

We investigate the use of transformer sequence models as dynamics models (TDMs) for control. In a number of experiments in the DeepMind control suite, we find that first, TDMs perform well in a single-environment learning setting when compared to baseline models. Second, TDMs exhibit strong generalization capabilities to unseen environments, both in a few-shot setting, where a generalist model is fine-tuned with small amounts of data from the target environment, and in a zero-shot setting, where a generalist model is applied to an unseen environment without any further training. We further demonstrate that generalizing system dynamics can work much better than generalizing optimal behavior directly as a policy. This makes TDMs a promising ingredient for a foundation model of control.
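
The abstract does not say how a dynamics model is turned into a controller, so the following is only a minimal sketch of one common approach, assuming a random-shooting planner: sample candidate action sequences, roll each one out through the model, and execute the first action of the best sequence. Everything here is hypothetical and for illustration only. `toy_dynamics` is a hand-written 1-D stand-in, not a transformer and not the authors' model, and `reward`, `plan_action`, and `run_episode` are names invented for this sketch.

```python
import random
from typing import Callable, List, Optional

# Hypothetical interface for a dynamics model: (state, action) -> next state.
# A learned model such as a TDM would be plugged in behind this signature.
DynamicsModel = Callable[[float, float], float]


def toy_dynamics(state: float, action: float) -> float:
    """Toy 1-D system in which the action nudges the state (assumption, not a TDM)."""
    return state + 0.1 * action


def reward(state: float) -> float:
    """Illustrative reward: higher when the state is closer to the origin."""
    return -abs(state)


def plan_action(
    model: DynamicsModel,
    state: float,
    horizon: int = 10,
    num_candidates: int = 200,
    rng: Optional[random.Random] = None,
) -> float:
    """Random-shooting planner: return the first action of the best sampled sequence."""
    rng = rng or random.Random(0)
    best_return = float("-inf")
    best_first_action = 0.0
    for _ in range(num_candidates):
        # Sample one candidate action sequence uniformly from [-1, 1].
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        predicted_state = state
        total = 0.0
        # Roll the sequence out through the model and sum the predicted rewards.
        for action in actions:
            predicted_state = model(predicted_state, action)
            total += reward(predicted_state)
        if total > best_return:
            best_return = total
            best_first_action = actions[0]
    return best_first_action


def run_episode(
    model: DynamicsModel, start_state: float, steps: int = 30, seed: int = 0
) -> List[float]:
    """Replan at every step and apply only the first planned action."""
    rng = random.Random(seed)
    state = start_state
    trajectory = [state]
    for _ in range(steps):
        action = plan_action(model, state, rng=rng)
        # In this toy setup the "environment" is the same function as the model.
        state = toy_dynamics(state, action)
        trajectory.append(state)
    return trajectory
```

Calling `run_episode(toy_dynamics, start_state=2.0)` returns the visited states, which should drift toward the origin. Because the planner only sees the model through the `DynamicsModel` signature, a fine-tuned (few-shot) or unmodified (zero-shot) generalist model could in principle be substituted without changing the planning loop.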

arXiv information

Authors	Ingmar Schubert, Jingwei Zhang, Jake Bruce, Sarah Bechtle, Emilio Parisotto, Martin Riedmiller, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Nicolas Heess
Published	2023-05-18 12:14:49+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
