Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion

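Summary

The application of diffusion models to 3D LiDAR scene completion is limited by the slow sampling speed of diffusion.
Score distillation accelerates diffusion sampling but degrades performance, whereas post-training with direct preference optimization (DPO) improves performance using preference data.
This paper proposes Distillation-DPO, a novel diffusion distillation framework for LiDAR scene completion with preference alignment.
First, the student model generates paired completion scenes from different initial noises.
Second, using LiDAR scene evaluation metrics as the preference signal, winning and losing sample pairs are constructed.
This construction is reasonable because most LiDAR scene metrics are informative but non-differentiable, so they cannot be optimized directly.
Third, Distillation-DPO optimizes the student model by exploiting the difference in score functions between the teacher and student models on the paired completion scenes.
This procedure is repeated until convergence.
Extensive experiments show that, compared with state-of-the-art LiDAR scene completion diffusion models, Distillation-DPO achieves higher-quality scene completion while accelerating completion by more than 5x.
To the best of the authors' knowledge, the method is the first to explore preference learning in distillation, and it provides insights into preference-aligned distillation.
The code is publicly available at https://github.com/happyw1nd/DistillationDPO.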

Abstract (original)

The application of diffusion models in 3D LiDAR scene completion is limited due to diffusion’s slow sampling speed. Score distillation accelerates diffusion sampling but with performance degradation, while post-training with direct preference optimization (DPO) boosts performance using preference data. This paper proposes Distillation-DPO, a novel diffusion distillation framework for LiDAR scene completion with preference alignment. First, the student model generates paired completion scenes with different initial noises. Second, using LiDAR scene evaluation metrics as preference, we construct winning and losing sample pairs. Such construction is reasonable, since most LiDAR scene metrics are informative but non-differentiable and so cannot be optimized directly. Third, Distillation-DPO optimizes the student model by exploiting the difference in score functions between the teacher and student models on the paired completion scenes. This procedure is repeated until convergence. Extensive experiments demonstrate that, compared to state-of-the-art LiDAR scene completion diffusion models, Distillation-DPO achieves higher-quality scene completion while accelerating the completion speed by more than 5-fold. To the best of our knowledge, our method is the first to explore adopting preference learning in distillation, and it provides insights into preference-aligned distillation. Our code is publicly available at https://github.com/happyw1nd/DistillationDPO.
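
The first two steps described in the abstract (generate two completions from different initial noises, then rank them with a non-differentiable metric) can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of Chamfer distance as the preference metric, the function names `chamfer_distance` and `build_preference_pair`, and the simulated completions are all assumptions made for illustration. A real pipeline would obtain the two completions from the student model rather than by perturbing the ground truth.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M), via broadcasting.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    # Mean nearest-neighbour distance in both directions.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def build_preference_pair(sample_a, sample_b, ground_truth):
    """Return (winner, loser); the sample with the lower metric value wins.

    Assumption: lower Chamfer distance to the ground truth means "preferred".
    """
    cd_a = chamfer_distance(sample_a, ground_truth)
    cd_b = chamfer_distance(sample_b, ground_truth)
    return (sample_a, sample_b) if cd_a <= cd_b else (sample_b, sample_a)


rng = np.random.default_rng(0)
ground_truth = rng.uniform(-1.0, 1.0, size=(256, 3))

# Hypothetical stand-ins for two student completions drawn from different
# initial noises: the ground truth perturbed by small and large noise.
completion_small = ground_truth + 0.01 * rng.standard_normal((256, 3))
completion_large = ground_truth + 0.5 * rng.standard_normal((256, 3))

winner, loser = build_preference_pair(completion_large, completion_small, ground_truth)
```

Because the metric is only used to order the two samples, it never needs to be differentiated, which matches the abstract's argument for using evaluation metrics as a preference signal. The third step, optimizing the student from the teacher-student score difference on these pairs, is not sketched here because the abstract does not specify the loss.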

arXiv information

Authors	An Zhao, Shengyuan Zhang, Ling Yang, Zejian Li, Jiale Wu, Haoran Xu, AnYang Wei, Perry Pengyun GU, Lingyun Sun
Published	2025-04-15 17:57:13+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
