Zero-Shot Video Editing through Adaptive Sliding Score Distillation

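Abstract

The rapidly evolving field of text-to-video (T2V) generation has renewed interest in controllable video editing.
Following advances in image editing, using an editing prompt to guide the denoising of a diffusion model has become a prominent approach, but this noise-based inference process inherently compromises the integrity of the original video, causing unintended over-editing and temporal discontinuities.
To address these problems, this work proposes a new paradigm, video-based score distillation, which allows the original video content to be manipulated directly.
Specifically, in contrast to image-based score distillation, it proposes an Adaptive Sliding Score Distillation strategy that incorporates both global and local video guidance to reduce the impact of editing errors.
Combined with the proposed Image-based Joint Guidance mechanism, this strategy mitigates the inherent instability of the T2V model and of single-step sampling.
In addition, a Weighted Attention Fusion module is designed to better preserve the key features of the original video and avoid over-editing.
Extensive experiments show that these strategies effectively address the existing problems and outperform current state-of-the-art methods.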

Abstract (original)

The rapidly evolving field of Text-to-Video generation (T2V) has catalyzed renewed interest in controllable video editing research. While the application of editing prompts to guide diffusion model denoising has gained prominence, mirroring advancements in image editing, this noise-based inference process inherently compromises the original video’s integrity, resulting in unintended over-editing and temporal discontinuities. To address these challenges, this study proposes a novel paradigm of video-based score distillation, facilitating direct manipulation of original video content. Specifically, distinguishing it from image-based score distillation, we propose an Adaptive Sliding Score Distillation strategy, which incorporates both global and local video guidance to reduce the impact of editing errors. Combined with our proposed Image-based Joint Guidance mechanism, it has the ability to mitigate the inherent instability of the T2V model and single-step sampling. Additionally, we design a Weighted Attention Fusion module to further preserve the key features of the original video and avoid over-editing. Extensive experiments demonstrate that these strategies effectively address existing challenges, achieving superior performance compared to current state-of-the-art methods.
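
The abstract does not give the update rule, so the following is only an illustrative sketch of the general idea of combining a whole-clip ("global") score-distillation direction with directions computed on overlapping frame windows ("local"). It is not the authors' algorithm. The function names, the fixed `window`, `stride`, and `local_weight` parameters, and the toy noise predictor are all assumptions made for illustration; a real system would call a T2V diffusion model where `noise_predictor` appears, and the paper's "adaptive" weighting is not modeled here.

```python
import numpy as np

def sliding_windows(num_frames, window, stride):
    """Return (start, end) frame ranges; assumes stride <= window so every frame is covered."""
    if window >= num_frames:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)  # extra window so the tail frames are covered
    return [(s, s + window) for s in starts]

def sds_direction(frames, noise_predictor, sigma, rng):
    """Generic score-distillation direction: predicted noise minus injected noise."""
    eps = rng.standard_normal(frames.shape)
    noisy = frames + sigma * eps
    return noise_predictor(noisy) - eps

def combined_update(video, noise_predictor, window=4, stride=2, sigma=0.5,
                    local_weight=0.5, seed=0):
    """Blend one global direction with the average of overlapping local directions."""
    rng = np.random.default_rng(seed)
    # Global guidance (hypothetical): one direction over the whole clip.
    global_dir = sds_direction(video, noise_predictor, sigma, rng)
    # Local guidance (hypothetical): accumulate per-window directions, then average overlaps.
    local_sum = np.zeros_like(video)
    counts = np.zeros(video.shape[0])
    for s, e in sliding_windows(video.shape[0], window, stride):
        local_sum[s:e] += sds_direction(video[s:e], noise_predictor, sigma, rng)
        counts[s:e] += 1
    local_dir = local_sum / counts.reshape(-1, *([1] * (video.ndim - 1)))
    return (1 - local_weight) * global_dir + local_weight * local_dir

if __name__ == "__main__":
    sigma = 0.5
    video = np.zeros((7, 2, 2))  # 7 tiny "frames"
    # Toy predictor that treats an all-ones clip as the clean target.
    toy_predictor = lambda noisy: (noisy - 1.0) / sigma
    step = combined_update(video, toy_predictor, sigma=sigma)
    video = video - 0.1 * step  # one gradient-style step toward the target
    print(video.mean())
```

With the toy predictor the injected noise cancels, so both the global and the local directions reduce to `(video - 1.0) / sigma` and the step moves every frame toward the target. With a real model the two directions would differ, which is where a blending weight becomes meaningful.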

arXiv information

Authors	Lianghan Zhu, Yanqi Bao, Jing Huo, Jing Wu, Yu-Kun Lai, Wenbin Li, Yang Gao
Published	2024-09-06 14:55:48+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
