CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion

Summary

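Diffusion Policy (DP) lets robots learn complex behaviors by imitating expert demonstrations through action diffusion.
In practice, however, hardware limitations often degrade data quality, and real-time constraints restrict model inference to instantaneous state and scene observations.
These limitations substantially reduce the effectiveness of learning from expert demonstrations and lead to failures in object localization, grasp planning, and long-horizon task execution.
To address these challenges, we propose Causal Diffusion Policy (CDP), a novel transformer-based diffusion model that improves action prediction by conditioning on historical action sequences, enabling more coherent, context-aware visuomotor policy learning.
To further reduce the computational cost of autoregressive inference, we also introduce a caching mechanism that stores attention key-value pairs from previous timesteps, substantially reducing redundant computation during execution.
Extensive experiments in simulated and real-world environments, covering diverse 2D and 3D manipulation tasks, show that CDP uniquely leverages historical action sequences to achieve significantly higher accuracy than existing methods.
Moreover, even when input observation quality is degraded, CDP maintains high precision by reasoning through temporal continuity, which highlights its practical robustness for robot control under realistic, imperfect conditions.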

Abstract (original)

Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints restrict model inference to instantaneous state and scene observations. These limitations seriously reduce the efficacy of learning from expert demonstrations, resulting in failures in object localization, grasp planning, and long-horizon task execution. To address these challenges, we propose Causal Diffusion Policy (CDP), a novel transformer-based diffusion model that enhances action prediction by conditioning on historical action sequences, thereby enabling more coherent and context-aware visuomotor policy learning. To further mitigate the computational cost associated with autoregressive inference, a caching mechanism is also introduced to store attention key-value pairs from previous timesteps, substantially reducing redundant computations during execution. Extensive experiments in both simulated and real-world environments, spanning diverse 2D and 3D manipulation tasks, demonstrate that CDP uniquely leverages historical action sequences to achieve significantly higher accuracy than existing methods. Moreover, even when faced with degraded input observation quality, CDP maintains remarkable precision by reasoning through temporal continuity, which highlights its practical robustness for robotic control under realistic, imperfect conditions.
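The key-value caching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, uses each token directly as its own query, key, and value (no learned projections), and the names `KVCache`, `attend`, and `causal_attention_full` are hypothetical. It only shows that appending each step's key and value to a cache gives the same outputs as recomputing causal attention over the whole prefix.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Hypothetical cache holding keys and values from previous timesteps."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Store only the new key/value; earlier ones are reused, not recomputed.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


def causal_attention_full(queries, keys, values):
    """Reference: rebuild the attended prefix from scratch at every step."""
    return [attend(q, keys[: t + 1], values[: t + 1])
            for t, q in enumerate(queries)]


# Toy "action tokens"; identity projections are an assumption of this sketch.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

cache = KVCache()
cached = [cache.step(t, t, t) for t in tokens]
full = causal_attention_full(tokens, tokens, tokens)

# Both paths should agree up to floating-point error.
max_diff = max(abs(a - b)
               for row_c, row_f in zip(cached, full)
               for a, b in zip(row_c, row_f))
```

In a real model the stored keys and values would be learned projections of past tokens, so the cache avoids re-running those projections for the whole history at every autoregressive step.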

arXiv information

Authors	Jiahua Ma, Yiran Qin, Yixiong Li, Xuanqi Liao, Yulan Guo, Ruimao Zhang
Published	2025-06-17 17:59:12+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
