Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Summary

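We present Free4D, a new tuning-free framework for generating 4D scenes from a single image.
Existing methods either focus on object-level generation, which makes scene-level generation infeasible, or rely on large-scale multi-view video datasets for expensive training and generalize poorly because 4D scene data is scarce.
In contrast, our key insight is to distill pre-trained foundation models into a consistent 4D scene representation, which offers advantages in efficiency and generalizability.
1) To achieve this, we first animate the input image with image-to-video diffusion models and then initialize the 4D geometric structure.
2) To turn this coarse structure into spatially and temporally consistent multi-view videos, we design an adaptive guidance mechanism that combines a point-guided denoising strategy for spatial consistency with a novel latent replacement strategy for temporal coherence.
3) To lift these generated observations into a consistent 4D representation, we propose a modulation-based refinement that mitigates inconsistencies while fully exploiting the generated information.
The resulting 4D representation supports real-time, controllable rendering and marks a significant advance in single-image 4D scene generation.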

Abstract (original)

We present Free4D, a novel tuning-free framework for 4D scene generation from a single image. Existing methods either focus on object-level generation, making scene-level generation infeasible, or rely on large-scale multi-view video datasets for expensive training, with limited generalization ability due to the scarcity of 4D scene data. In contrast, our key insight is to distill pre-trained foundation models for consistent 4D scene representation, which offers promising advantages such as efficiency and generalizability. 1) To achieve this, we first animate the input image using image-to-video diffusion models followed by 4D geometric structure initialization. 2) To turn this coarse structure into spatial-temporal consistent multiview videos, we design an adaptive guidance mechanism with a point-guided denoising strategy for spatial consistency and a novel latent replacement strategy for temporal coherence. 3) To lift these generated observations into consistent 4D representation, we propose a modulation-based refinement to mitigate inconsistencies while fully leveraging the generated information. The resulting 4D representation enables real-time, controllable rendering, marking a significant advancement in single-image-based 4D scene generation.
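The abstract names a "latent replacement strategy" but does not say how it is implemented. As a rough illustration of the general idea only, the sketch below shows one generic way latent replacement can be wired into a sampling loop: after every denoising update, selected latent cells are overwritten with values from a reference latent so that those regions stay fixed. All names here (`replace_latent`, `toy_denoise_step`, `sample_with_replacement`), the boolean mask, and the toy update rule are assumptions made for illustration and are not taken from the paper.

```python
from typing import List

Latent = List[List[float]]  # toy 2D latent grid (hypothetical stand-in for a real tensor)
Mask = List[List[bool]]     # True where the reference latent should be kept


def replace_latent(current: Latent, reference: Latent, mask: Mask) -> Latent:
    """Return a copy of `current` with masked cells overwritten by `reference`."""
    if not (len(current) == len(reference) == len(mask)):
        raise ValueError("row counts differ")
    out: Latent = []
    for cur_row, ref_row, mask_row in zip(current, reference, mask):
        if not (len(cur_row) == len(ref_row) == len(mask_row)):
            raise ValueError("column counts differ")
        out.append([r if m else c for c, r, m in zip(cur_row, ref_row, mask_row)])
    return out


def toy_denoise_step(latent: Latent, strength: float = 0.5) -> Latent:
    """Placeholder for a diffusion model update: shrinks every value toward zero."""
    return [[v * (1.0 - strength) for v in row] for row in latent]


def sample_with_replacement(init: Latent, reference: Latent, mask: Mask,
                            steps: int = 3) -> Latent:
    """Alternate a denoising update with latent replacement for `steps` iterations.

    Assumption: a real sampler would first noise `reference` to the current
    noise level before replacing; this toy version skips that for brevity.
    """
    latent = init
    for _ in range(steps):
        latent = toy_denoise_step(latent)
        latent = replace_latent(latent, reference, mask)
    return latent


if __name__ == "__main__":
    init = [[16.0, 16.0], [16.0, 16.0]]
    reference = [[1.0, 1.0], [1.0, 1.0]]
    mask = [[True, False], [False, True]]
    print(sample_with_replacement(init, reference, mask))
```

In this toy setup the masked cells always end at the reference value, while the unmasked cells follow the denoising update alone. How Free4D chooses the regions to replace and the source of the replacement latents is not stated in the abstract.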

arXiv information

Authors	Tianqi Liu, Zihao Huang, Zhaoxi Chen, Guangcong Wang, Shoukang Hu, Liao Shen, Huiqiang Sun, Zhiguo Cao, Wei Li, Ziwei Liu
Published	2025-03-26 17:59:44+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
