360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Abstract

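360-degree panoramic videos have recently attracted growing interest in both research and applications because of the heightened sense of immersion they provide.
Because capturing 360-degree panoramic videos is expensive, there is an urgent need to generate the desired panoramic videos from given prompts.
Recently, emerging text-to-video (T2V) diffusion methods have shown notable effectiveness in standard video generation.
However, because content and motion patterns differ significantly between panoramic and standard videos, these methods struggle to produce satisfactory 360-degree panoramic videos.
In this paper, we propose a controllable panorama video generation pipeline called the 360-Degree Video Diffusion model (360DVD), which generates panoramic videos based on given prompts and motion conditions.
Specifically, we introduce a lightweight module called the 360-Adapter, together with supporting 360 Enhancement Techniques, to adapt pre-trained T2V models for 360-degree video generation.
In addition, we propose a new panorama dataset named WEB360, consisting of 360-degree video-text pairs for training 360DVD, which addresses the lack of captioned panoramic video datasets.
Extensive experiments demonstrate the superiority and effectiveness of 360DVD for panorama video generation.
The code and dataset will be released soon.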

Abstract (original)

360-degree panoramic videos recently attract more interest in both studies and applications, courtesy of the heightened immersive experiences they engender. Due to the expensive cost of capturing 360-degree panoramic videos, generating desirable panoramic videos by given prompts is urgently required. Recently, the emerging text-to-video (T2V) diffusion methods demonstrate notable effectiveness in standard video generation. However, due to the significant gap in content and motion patterns between panoramic and standard videos, these methods encounter challenges in yielding satisfactory 360-degree panoramic videos. In this paper, we propose a controllable panorama video generation pipeline named 360-Degree Video Diffusion model (360DVD) for generating panoramic videos based on the given prompts and motion conditions. Concretely, we introduce a lightweight module dubbed 360-Adapter and assisted 360 Enhancement Techniques to transform pre-trained T2V models for 360-degree video generation. We further propose a new panorama dataset named WEB360 consisting of 360-degree video-text pairs for training 360DVD, addressing the absence of captioned panoramic video datasets. Extensive experiments demonstrate the superiority and effectiveness of 360DVD for panorama video generation. The code and dataset will be released soon.

arXiv information

Authors	Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, Jian Zhang
Published	2024-01-12 13:52:29+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
