Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

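Summary

This paper introduces Diffusion Policy, a new method for generating robot behavior that represents a robot's visuomotor policy as a conditional denoising diffusion process.
Benchmarked on 11 tasks drawn from 4 robot manipulation benchmarks, Diffusion Policy consistently outperforms existing state-of-the-art robot learning methods, with an average improvement of 46.9%.
Diffusion Policy learns the gradient of the action-distribution score function and, at inference time, iteratively optimizes with respect to this gradient field through a series of stochastic Langevin dynamics steps.
Applied to robot policies, the diffusion formulation brings strong advantages: it handles multimodal action distributions gracefully, suits high-dimensional action spaces, and shows impressive training stability.
To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, the paper presents a set of key technical contributions, including receding horizon control, visual conditioning, and a time-series diffusion transformer.
The authors hope this work will help motivate a new generation of policy learning techniques that leverage the strong generative modeling capabilities of diffusion models.
Code, data, and training details will be made publicly available.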
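
The summary's description of following a score-function gradient with stochastic Langevin dynamics steps can be made concrete with a toy example. The sketch below is not the authors' implementation: it assumes a 1-D "action" and replaces the learned, observation-conditioned network with an analytic score for a two-mode Gaussian mixture. The names `mixture_score` and `langevin_sample`, the step size, and the step count are all hypothetical choices for illustration.

```python
import math
import random


def mixture_score(x, means=(-2.0, 2.0), sigma=0.5):
    """Gradient of log p(x) for an equal-weight 1-D Gaussian mixture.

    Assumption: this analytic score stands in for a learned score network.
    """
    # Unnormalized log-density of each component, shifted for numerical stability.
    logits = [-((x - m) ** 2) / (2 * sigma ** 2) for m in means]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    # The mixture score is the responsibility-weighted sum of component scores.
    return sum(w / total * (m - x) / sigma ** 2 for w, m in zip(weights, means))


def langevin_sample(score_fn, steps=500, step_size=0.01, rng=None):
    """Draw one sample by iterating x <- x + eps * score(x) + sqrt(2 * eps) * noise."""
    if rng is None:
        rng = random.Random()
    x = rng.gauss(0.0, 3.0)  # start from broad Gaussian noise
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step_size * score_fn(x) + math.sqrt(2 * step_size) * noise
    return x


rng = random.Random(0)
samples = [langevin_sample(mixture_score, rng=rng) for _ in range(200)]
left = sum(1 for s in samples if s < 0)
print(f"{left} samples near -2, {len(samples) - left} samples near +2")
```

Because each chain starts from noise and follows the local gradient, different chains settle into different modes rather than averaging them, which is one way to picture the multimodal behavior the summary refers to.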

Abstract (original)

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot’s visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 11 different tasks from 4 different robot manipulation benchmarks and find that it consistently outperforms existing state-of-the-art robot learning methods with an average improvement of 46.9%. Diffusion Policy learns the gradient of the action-distribution score function and iteratively optimizes with respect to this gradient field during inference via a series of stochastic Langevin dynamics steps. We find that the diffusion formulation yields powerful advantages when used for robot policies, including gracefully handling multimodal action distributions, being suitable for high-dimensional action spaces, and exhibiting impressive training stability. To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, this paper presents a set of key technical contributions including the incorporation of receding horizon control, visual conditioning, and the time-series diffusion transformer. We hope this work will help motivate a new generation of policy learning techniques that are able to leverage the powerful generative modeling capabilities of diffusion models. Code, data, and training details will be publicly available.
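
The abstract names receding horizon control as one of the contributions without describing it. As a rough illustration of the general idea only (predict a sequence of actions, execute just the first few, then re-plan from the new observation), here is a minimal sketch. The planner, environment, goal, and horizon lengths are hypothetical stand-ins and are not taken from the paper.

```python
GOAL = 5.0  # hypothetical target position for the toy task


def toy_planner(position, horizon=8):
    """Hypothetical stand-in for a learned policy: plan `horizon` moves toward GOAL."""
    plan = []
    for _ in range(horizon):
        # Move toward the goal by at most 1.0 per step.
        delta = max(-1.0, min(1.0, GOAL - position))
        plan.append(delta)
        position += delta
    return plan


def toy_env(position, action):
    """Hypothetical 1-D environment: the action is a displacement."""
    return position + action


def receding_horizon_rollout(predict_actions, step_env, obs, total_steps, execute_horizon):
    """Execute only the first `execute_horizon` actions of each plan, then re-plan."""
    executed = []
    while len(executed) < total_steps:
        plan = predict_actions(obs)  # full predicted action sequence
        for action in plan[:execute_horizon]:
            obs = step_env(obs, action)
            executed.append(action)
            if len(executed) == total_steps:
                break
    return obs, executed


final_pos, actions = receding_horizon_rollout(
    toy_planner, toy_env, obs=0.0, total_steps=12, execute_horizon=4
)
print(final_pos, len(actions))
```

Executing a short prefix of each plan keeps consecutive actions consistent with one another, while re-planning lets the controller react to new observations.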

arXiv information

Authors	Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel, Shuran Song
Published	2023-03-07 18:50:03+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
