Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Summary

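This paper introduces Diffusion Policy, a new method for generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process.
We benchmark Diffusion Policy on 12 tasks drawn from 4 robot manipulation benchmarks and find that it consistently outperforms existing state-of-the-art robot learning methods, with an average improvement of 46.9%.
Diffusion Policy learns the gradient of the action-distribution score function and, during inference, iteratively optimizes with respect to this gradient field through a series of stochastic Langevin dynamics steps.
We find that the diffusion formulation brings strong advantages when used for robot policies: it handles multimodal action distributions gracefully, suits high-dimensional action spaces, and shows excellent training stability.
To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, the paper presents a set of key technical contributions, including the incorporation of receding horizon control, visual conditioning, and a time-series diffusion transformer.
We hope this work helps motivate a new generation of policy learning methods that can exploit the powerful generative modeling capabilities of diffusion models.
Code, data, and training details will be made publicly available.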
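
The Langevin-dynamics inference step described in the summary can be illustrated with a toy sketch. It is not the paper's implementation: the paper learns the score with a neural network, whereas the sketch below assumes a hand-written analytic score for a one-dimensional, two-mode "action" distribution. The names `MODES`, `SIGMA`, `score`, `langevin_sample`, and `sample_actions` are hypothetical and chosen only for illustration.

```python
import math
import random

# Assumption: a toy 1-D action distribution with two equally likely modes.
MODES = (-2.0, 2.0)
SIGMA = 0.5


def score(x):
    """Gradient of log p(x) for an equal-weight mixture of two Gaussians."""
    logits = [-((x - m) ** 2) / (2 * SIGMA ** 2) for m in MODES]
    top = max(logits)  # subtract the max before exponentiating, for stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    # Responsibility-weighted average of each component's own score.
    return sum(w / total * (-(x - m) / SIGMA ** 2) for w, m in zip(weights, MODES))


def langevin_sample(rng, steps=500, step_size=0.01):
    """Start from Gaussian noise and follow the score field with injected noise."""
    x = rng.gauss(0.0, 1.0)
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + 0.5 * step_size * score(x) + math.sqrt(step_size) * noise
    return x


def sample_actions(n=200, seed=0):
    """Draw n independent samples with a seeded generator."""
    rng = random.Random(seed)
    return [langevin_sample(rng) for _ in range(n)]
```

Calling `sample_actions()` should return values clustered around both -2 and 2 rather than around their average, which is the multimodal behavior the summary refers to.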

Summary (original)

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot’s visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 12 different tasks from 4 different robot manipulation benchmarks and find that it consistently outperforms existing state-of-the-art robot learning methods with an average improvement of 46.9%. Diffusion Policy learns the gradient of the action-distribution score function and iteratively optimizes with respect to this gradient field during inference via a series of stochastic Langevin dynamics steps. We find that the diffusion formulation yields powerful advantages when used for robot policies, including gracefully handling multimodal action distributions, being suitable for high-dimensional action spaces, and exhibiting impressive training stability. To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, this paper presents a set of key technical contributions including the incorporation of receding horizon control, visual conditioning, and the time-series diffusion transformer. We hope this work will help motivate a new generation of policy learning techniques that are able to leverage the powerful generative modeling capabilities of diffusion models. Code, data, and training details will be publicly available.
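
Receding horizon control, named in the abstract as one of the contributions, can be sketched as follows: the policy predicts a sequence of future actions, only the first few are executed, and the policy then replans from the new state. This is a minimal sketch under stated assumptions, not the authors' code. The planner `predict_action_sequence` is a hypothetical stand-in for a learned policy, and the parameters `prediction_horizon`, `action_horizon`, and `max_replans` are illustrative.

```python
def predict_action_sequence(position, goal, horizon):
    """Hypothetical planner: `horizon` steps toward the goal, each clipped to [-1, 1]."""
    actions = []
    for _ in range(horizon):
        step = max(-1.0, min(1.0, goal - position))
        actions.append(step)
        position += step
    return actions


def receding_horizon_rollout(start, goal, prediction_horizon=8,
                             action_horizon=4, max_replans=20):
    """Plan `prediction_horizon` actions, execute the first `action_horizon`, replan."""
    position = start
    trajectory = [position]
    replans = 0
    while replans < max_replans and abs(goal - position) > 1e-9:
        plan = predict_action_sequence(position, goal, prediction_horizon)
        replans += 1
        # Execute only the leading part of the plan; the rest is discarded.
        for action in plan[:action_horizon]:
            position += action
            trajectory.append(position)
    return trajectory, replans
```

For example, `receding_horizon_rollout(0.0, 10.0)` plans three times and executes four actions after each plan. In this toy the `action_horizon` setting trades responsiveness (a shorter horizon replans more often) against the number of planner calls.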

arXiv information

Authors	Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel, Shuran Song
Published	2023-06-01 15:27:43+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
