Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models

Summary

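Recent advances in diffusion-based imitation learning, which performs well at modeling multimodal distributions and trains stably, have driven substantial progress across a range of robot learning tasks.
In visual navigation, earlier diffusion-based policies typically generate action sequences by denoising, starting from Gaussian noise.
However, the target action distribution often differs substantially from Gaussian noise, which leads to redundant denoising steps and makes learning harder.
In addition, because effective actions are sparsely distributed, it is difficult for a policy to generate accurate actions without guidance.
To address these problems, the authors propose NaviBridger, a novel, unified visual navigation framework built on denoising diffusion bridge models.
The approach generates actions starting from any informative prior action, which improves both the guidance and the efficiency of the denoising process.
The paper explores how diffusion bridges can strengthen imitation learning for visual navigation, and further examines three source policies for generating the prior actions.
Extensive experiments in simulated indoor environments and in real-world indoor and outdoor scenarios show that NaviBridger speeds up policy inference and outperforms the baselines at generating target action sequences.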
Code is available at https://github.com/hren20/NaiviBridger.
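
To make the idea of "starting from a prior action instead of Gaussian noise" concrete, here is a minimal sketch using a Brownian bridge, the simplest kind of diffusion bridge. It is not NaviBridger's implementation, and the paper's actual bridge formulation and noise schedule may differ. The function name bridge_sample, the sigma value, and the waypoint numbers are all hypothetical and chosen only for illustration.

```python
import math
import random


def bridge_sample(prior, target, t, sigma=0.1, rng=random):
    """Sample an intermediate state of a Brownian bridge pinned at both ends.

    Convention assumed here: t = 0 is the target (expert) action and
    t = 1 is the prior action. The marginal at time t has mean
    (1 - t) * target + t * prior and standard deviation
    sigma * sqrt(t * (1 - t)), so the noise vanishes at both endpoints.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    std = sigma * math.sqrt(t * (1.0 - t))
    return [(1.0 - t) * a + t * p + std * rng.gauss(0.0, 1.0)
            for a, p in zip(target, prior)]


# Hypothetical flattened (x, y) waypoints, for illustration only.
target = [0.5, 0.0, 1.0, 0.1]  # what an expert would do
prior = [0.4, 0.0, 0.9, 0.0]   # coarse guess, e.g. "drive straight ahead"

random.seed(0)
for t in (1.0, 0.5, 0.0):
    print(t, bridge_sample(prior, target, t))
```

At t = 1 the sample equals the prior action exactly, and at t = 0 it equals the target, so a model trained to reverse this process only has to cover the gap between a reasonable guess and the expert action, rather than the gap between pure noise and the expert action.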

Summary (original)

Recent advancements in diffusion-based imitation learning, which show impressive performance in modeling multimodal distributions and training stability, have led to substantial progress in various robot learning tasks. In visual navigation, previous diffusion-based policies typically generate action sequences by initiating from denoising Gaussian noise. However, the target action distribution often diverges significantly from Gaussian noise, leading to redundant denoising steps and increased learning complexity. Additionally, the sparsity of effective action distributions makes it challenging for the policy to generate accurate actions without guidance. To address these issues, we propose a novel, unified visual navigation framework leveraging the denoising diffusion bridge models named NaviBridger. This approach enables action generation by initiating from any informative prior actions, enhancing guidance and efficiency in the denoising process. We explore how diffusion bridges can enhance imitation learning in visual navigation tasks and further examine three source policies for generating prior actions. Extensive experiments in both simulated and real-world indoor and outdoor scenarios demonstrate that NaviBridger accelerates policy inference and outperforms the baselines in generating target action sequences. Code is available at https://github.com/hren20/NaiviBridger.

arXiv information

Authors	Hao Ren, Yiming Zeng, Zetong Bi, Zhaoliang Wan, Junlong Huang, Hui Cheng
Published	2025-04-14 09:42:35+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
