SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Summary

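Recent diffusion-based video restoration (VR) methods substantially improve visual quality, but at a prohibitive computational cost during inference.
Several distillation-based approaches have shown that one-step image restoration is feasible, but extending them to VR remains challenging and underexplored, particularly for high-resolution video in real-world settings.
This work proposes SeedVR2, a one-step diffusion-based VR model trained adversarially against real data.
To handle challenging high-resolution VR in a single step, the authors introduce several enhancements to both the model architecture and the training procedure.
Specifically, they propose an adaptive window attention mechanism in which the window size is adjusted dynamically to fit the output resolution, avoiding the window inconsistency observed when window attention with a predefined window size is applied to high-resolution VR.
To stabilize and improve adversarial post-training for VR, they further verify the effectiveness of a series of losses, including a proposed feature matching loss that does not significantly sacrifice training efficiency.
Extensive experiments show that, in a single step, SeedVR2 achieves performance comparable to or better than existing VR approaches.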
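
The adaptive window idea can be illustrated with a minimal sketch. The abstract does not state the exact rule for choosing the window size, so the code assumes one plausible choice: target a fixed number of windows per axis and derive the window size from the feature-map resolution. The function names and the `windows_per_side` parameter are hypothetical and are not taken from the SeedVR2 code.

```python
import math


def adaptive_window_size(feat_h, feat_w, windows_per_side=4):
    """Pick a window size from the feature-map resolution.

    Assumption (not from the paper): aim for about `windows_per_side`
    windows along each axis, so the window grows with the resolution.
    """
    if min(feat_h, feat_w, windows_per_side) < 1:
        raise ValueError("all arguments must be positive integers")
    win_h = math.ceil(feat_h / windows_per_side)
    win_w = math.ceil(feat_w / windows_per_side)
    return win_h, win_w


def window_grid(feat_h, feat_w, win_h, win_w):
    """Number of windows needed along each axis to cover the feature map."""
    return math.ceil(feat_h / win_h), math.ceil(feat_w / win_w)


if __name__ == "__main__":
    for h, w in [(45, 80), (90, 160)]:
        adaptive = adaptive_window_size(h, w)
        print((h, w), "adaptive window:", adaptive,
              "grid:", window_grid(h, w, *adaptive),
              "| fixed 16x16 grid:", window_grid(h, w, 16, 16))
```

In this example the adaptive rule yields a 4x4 window grid for both the 45x80 and the 90x160 feature maps, while a fixed 16x16 window yields 3x5 and 6x10 grids, so the window layout under a fixed size changes with resolution.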

Abstract (original)

Recent advances in diffusion-based video restoration (VR) demonstrate significant improvement in visual quality, yet yield a prohibitive computational cost during inference. While several distillation-based approaches have exhibited the potential of one-step image restoration, extending existing approaches to VR remains challenging and underexplored, particularly when dealing with high-resolution video in real-world settings. In this work, we propose a one-step diffusion-based VR model, termed as SeedVR2, which performs adversarial VR training against real data. To handle the challenging high-resolution VR within a single step, we introduce several enhancements to both model architecture and training procedures. Specifically, an adaptive window attention mechanism is proposed, where the window size is dynamically adjusted to fit the output resolutions, avoiding window inconsistency observed under high-resolution VR using window attention with a predefined window size. To stabilize and improve the adversarial post-training towards VR, we further verify the effectiveness of a series of losses, including a proposed feature matching loss without significantly sacrificing training efficiency. Extensive experiments show that SeedVR2 can achieve comparable or even better performance compared with existing VR approaches in a single step.
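
The abstract names a feature matching loss but does not define it. As a rough illustration only, the sketch below follows the common GAN formulation: the mean absolute difference between discriminator features of real and restored samples, averaged over layers. Whether SeedVR2 uses this exact form is an assumption; the function name and the plain-list feature representation are hypothetical and chosen to keep the example dependency-free.

```python
def feature_matching_loss(real_feats, fake_feats):
    """Average per-layer mean absolute difference between two feature sets.

    real_feats / fake_feats: lists with one entry per discriminator layer,
    each entry a flat list of floats (a stand-in for a feature tensor).
    This is the generic GAN-style formulation, assumed here for illustration.
    """
    if not real_feats or len(real_feats) != len(fake_feats):
        raise ValueError("both inputs need the same, non-zero number of layers")
    per_layer = []
    for real, fake in zip(real_feats, fake_feats):
        if not real or len(real) != len(fake):
            raise ValueError("feature sizes must match within each layer")
        layer_l1 = sum(abs(r - f) for r, f in zip(real, fake)) / len(real)
        per_layer.append(layer_l1)
    return sum(per_layer) / len(per_layer)


if __name__ == "__main__":
    real = [[1.0, 2.0], [0.0, 4.0]]   # toy features from two layers
    fake = [[1.0, 4.0], [1.0, 1.0]]
    print(feature_matching_loss(real, fake))
```

In a training setup, a term of this kind would typically be added to the generator objective alongside the adversarial loss, with features taken from intermediate discriminator layers.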

arXiv information

Authors	Jianyi Wang, Shanchuan Lin, Zhijie Lin, Yuxi Ren, Meng Wei, Zongsheng Yue, Shangchen Zhou, Hao Chen, Yang Zhao, Ceyuan Yang, Xuefeng Xiao, Chen Change Loy, Lu Jiang
Published	2025-06-05 17:51:05+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
