TransPixar: Advancing Text-to-Video Generation with Transparency

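Summary

Text-to-video generation models have made significant progress, enabling diverse applications in entertainment, advertising, and education.
However, generating RGBA video, which includes an alpha channel for transparency, remains challenging because datasets are limited and existing models are difficult to adapt.
Alpha channels are essential for visual effects (VFX), allowing transparent elements such as smoke and reflections to blend seamlessly into a scene.
We introduce TransPixar, a method that extends pretrained video models to RGBA generation while preserving their original RGB capabilities.
TransPixar builds on a diffusion transformer (DiT) architecture, incorporates alpha-specific tokens, and uses LoRA-based fine-tuning to jointly generate highly consistent RGB and alpha channels.
By optimizing the attention mechanism, TransPixar retains the strengths of the original RGB model and achieves strong alignment between the RGB and alpha channels despite limited training data.
Our approach effectively generates diverse and consistent RGBA videos, expanding the possibilities for VFX and interactive content creation.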

Summary (original)

Text-to-video generative models have made significant strides, enabling diverse applications in entertainment, advertising, and education. However, generating RGBA video, which includes alpha channels for transparency, remains a challenge due to limited datasets and the difficulty of adapting existing models. Alpha channels are crucial for visual effects (VFX), allowing transparent elements like smoke and reflections to blend seamlessly into scenes. We introduce TransPixar, a method to extend pretrained video models for RGBA generation while retaining the original RGB capabilities. TransPixar leverages a diffusion transformer (DiT) architecture, incorporating alpha-specific tokens and using LoRA-based fine-tuning to jointly generate RGB and alpha channels with high consistency. By optimizing attention mechanisms, TransPixar preserves the strengths of the original RGB model and achieves strong alignment between RGB and alpha channels despite limited training data. Our approach effectively generates diverse and consistent RGBA videos, advancing the possibilities for VFX and interactive content creation.
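
To make concrete why an alpha channel matters for VFX, the sketch below shows the standard straight-alpha "over" blend that a compositor might apply to an RGBA frame. It is a generic illustration and not part of TransPixar itself; the function names `composite_over` and `composite_frame` and the tuple-based frame layout are assumptions chosen for this example.

```python
def composite_over(fg_rgb, alpha, bg_rgb):
    """Blend one foreground pixel over a background pixel (straight alpha).

    Hypothetical helper for illustration: channels are floats in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # out = alpha * foreground + (1 - alpha) * background, per channel
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg_rgb, bg_rgb))


def composite_frame(fg_rgba, bg_rgb):
    """Composite an RGBA frame over an RGB frame of the same size.

    Assumed layout: each frame is a list of rows, and each row is a list of
    (r, g, b, a) tuples for the foreground or (r, g, b) tuples for the background.
    """
    return [
        [composite_over(px[:3], px[3], bg_px) for px, bg_px in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(fg_rgba, bg_rgb)
    ]


# A grey, mostly transparent "smoke" pixel next to a fully opaque red pixel,
# both placed over a blue background.
smoke_frame = [[(0.5, 0.5, 0.5, 0.25), (1.0, 0.0, 0.0, 1.0)]]
background = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
blended = composite_frame(smoke_frame, background)
```

With an alpha of 0.25 the background dominates the first pixel, while the opaque second pixel fully replaces it. This is why inconsistent RGB and alpha channels show up as visible fringes or halos after compositing.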

arXiv information

Authors	Luozhou Wang, Yijun Li, Zhifei Chen, Jui-Hsien Wang, Zhifei Zhang, He Zhang, Zhe Lin, Yingcong Chen
Published	2025-01-06 13:32:16+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
