FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

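Summary

Generating flexible-view 3D scenes from a single image, including 360° rotation and zooming, is difficult because 3D data is scarce.
To this end, we introduce FlexWorld, a novel framework consisting of two key components: (1) a strong video-to-video (V2V) diffusion model that generates high-quality novel view images from incomplete input rendered from a coarse scene, and (2) a progressive expansion process that constructs a complete 3D scene.
In particular, by leveraging an advanced pre-trained video model and accurate depth-estimated training pairs, the V2V model can generate novel views under large camera pose variations.
Building on it, FlexWorld progressively generates new 3D content and integrates it into the global scene through geometry-aware scene fusion.
Extensive experiments demonstrate the effectiveness of FlexWorld in generating high-quality novel view videos and flexible-view 3D scenes, achieving superior visual quality under multiple popular metrics and datasets compared to existing state-of-the-art methods.
Qualitatively, we highlight that FlexWorld can generate high-fidelity scenes with flexible views such as 360° rotation and zooming.
Project page: https://ml-gsai.github.io/FlexWorld.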

Summary (original)

Generating flexible-view 3D scenes, including 360° rotation and zooming, from single images is challenging due to a lack of 3D data. To this end, we introduce FlexWorld, a novel framework consisting of two key components: (1) a strong video-to-video (V2V) diffusion model to generate high-quality novel view images from incomplete input rendered from a coarse scene, and (2) a progressive expansion process to construct a complete 3D scene. In particular, leveraging an advanced pre-trained video model and accurate depth-estimated training pairs, our V2V model can generate novel views under large camera pose variations. Building upon it, FlexWorld progressively generates new 3D content and integrates it into the global scene through geometry-aware scene fusion. Extensive experiments demonstrate the effectiveness of FlexWorld in generating high-quality novel view videos and flexible-view 3D scenes from single images, achieving superior visual quality under multiple popular metrics and datasets compared to existing state-of-the-art methods. Qualitatively, we highlight that FlexWorld can generate high-fidelity scenes with flexible views like 360° rotations and zooming. Project page: https://ml-gsai.github.io/FlexWorld.

arXiv information

Authors	Luxi Chen, Zihan Zhou, Min Zhao, Yikai Wang, Ge Zhang, Wenhao Huang, Hao Sun, Ji-Rong Wen, Chongxuan Li
Published	2025-03-17 15:18:38+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
