StyleMaster: Stylize Your Video with Artistic Generation and Translation

Summary

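Style control has become a popular feature of video generation models.
Existing methods often generate videos that drift far from the given style, leak content from the reference image, and struggle to transfer an existing video to the desired style.
Our first observation is that the style extraction stage matters: existing methods emphasize global style but ignore local textures.
To bring in texture features while preventing content leakage, we use prompt-patch similarity to filter out content-related patches and retain the style patches.
For global style extraction, we generate a paired style dataset through model illusion to facilitate contrastive learning, which greatly enhances absolute style consistency.
Moreover, to bridge the image-to-video gap, we train a lightweight motion adapter on still videos, which implicitly enhances the extent of stylization and lets our image-trained model be applied seamlessly to videos.
Thanks to these efforts, our approach, StyleMaster, not only achieves significant improvement in both style resemblance and temporal coherence, but also generalizes easily to video style transfer with a gray tile ControlNet.
Extensive experiments and visualizations demonstrate that StyleMaster significantly outperforms competitors, effectively generating high-quality stylized videos that align with the textual content and closely resemble the style of the reference images.
Our project page is at https://zixuan-ye.github.io/stylemaster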
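
The patch filtering idea in the summary can be illustrated with a minimal sketch. It assumes that each reference-image patch and the text prompt have already been embedded into a shared vector space, that the prompt describes content (so a high prompt-patch similarity marks a content-related patch), and that a fixed fraction of patches is dropped. The function names and the `drop_ratio` parameter are hypothetical and do not come from the paper, which may select patches differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_style_patches(patch_embs, prompt_emb, drop_ratio=0.5):
    """Return the indices of patches kept as style patches.

    Assumption: the patches most similar to the prompt embedding are
    content-related and are dropped; the remaining ones are kept.
    drop_ratio is a hypothetical hyperparameter, not a value from the paper.
    """
    if not 0.0 <= drop_ratio <= 1.0:
        raise ValueError("drop_ratio must be in [0, 1]")
    sims = [cosine(p, prompt_emb) for p in patch_embs]
    n_drop = int(len(patch_embs) * drop_ratio)
    # Rank patch indices from most to least similar to the prompt.
    ranked = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [i for i in range(len(patch_embs)) if i not in dropped]


# Toy 2-D embeddings: the first two patches point the same way as the prompt.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-0.2, 1.0]]
prompt = [1.0, 0.0]
kept = select_style_patches(patches, prompt, drop_ratio=0.5)
```

In this toy input, the two patches aligned with the prompt are treated as content and dropped, and the other two are kept as style patches.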

Summary (original)

Style control has been popular in video generation models. Existing methods often generate videos far from the given style, cause content leakage, and struggle to transfer one video to the desired style. Our first observation is that the style extraction stage matters, whereas existing methods emphasize global style but ignore local textures. In order to bring texture features while preventing content leakage, we filter content-related patches while retaining style ones based on prompt-patch similarity; for global style extraction, we generate a paired style dataset through model illusion to facilitate contrastive learning, which greatly enhances the absolute style consistency. Moreover, to fill in the image-to-video gap, we train a lightweight motion adapter on still videos, which implicitly enhances stylization extent, and enables our image-trained model to be seamlessly applied to videos. Benefited from these efforts, our approach, StyleMaster, not only achieves significant improvement in both style resemblance and temporal coherence, but also can easily generalize to video style transfer with a gray tile ControlNet. Extensive experiments and visualizations demonstrate that StyleMaster significantly outperforms competitors, effectively generating high-quality stylized videos that align with textual content and closely resemble the style of reference images. Our project page is at https://zixuan-ye.github.io/stylemaster
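
The abstract does not state which contrastive objective is used for global style extraction. As an illustration only, the sketch below uses an InfoNCE-style loss, a common choice for contrastive learning, and assumes that two images sharing a style form a positive pair while images of other styles act as negatives. The function names and the `temperature` value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor style embedding.

    Assumption: `positive` shares the anchor's style and `negatives` do not.
    The loss is the negative log-probability of picking the positive
    among all candidates, with cosine similarity as the score.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]


anchor = [1.0, 0.0]
same_style = [0.9, 0.1]
other_styles = [[0.0, 1.0], [-1.0, 0.2]]
loss_good = info_nce(anchor, same_style, other_styles)
loss_bad = info_nce(anchor, other_styles[0], [same_style, other_styles[1]])
```

The loss is small when the same-style embedding is the closest candidate to the anchor (`loss_good`) and larger when a different style is treated as the positive (`loss_bad`), which is the pressure that pulls same-style embeddings together.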

arXiv information

Authors	Zixuan Ye, Huijuan Huang, Xintao Wang, Pengfei Wan, Di Zhang, Wenhan Luo
Published	2024-12-10 18:44:08+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
