NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images

Abstract

Recent advancements in generative models have significantly improved novel view synthesis (NVS) from multi-view data. However, existing methods depend on external multi-view alignment processes, such as explicit pose estimation or pre-reconstruction, which limits their flexibility and accessibility, especially when alignment is unstable due to insufficient overlap or occlusions between views. In this paper, we propose NVComposer, a novel approach that eliminates the need for explicit external alignment. NVComposer enables the generative model to implicitly infer spatial and geometric relationships between multiple conditional views by introducing two key components: 1) an image-pose dual-stream diffusion model that simultaneously generates target novel views and condition camera poses, and 2) a geometry-aware feature alignment module that distills geometric priors from dense stereo models during training. Extensive experiments demonstrate that NVComposer achieves state-of-the-art performance in generative multi-view NVS tasks, removing the reliance on external alignment and thus improving model accessibility. Our approach shows substantial improvements in synthesis quality as the number of unposed input views increases, highlighting its potential for more flexible and accessible generative NVS systems.

arXiv information

Authors	Lingen Li, Zhaoyang Zhang, Yaowei Li, Jiale Xu, Xiaoyu Li, Wenbo Hu, Weihao Cheng, Jinwei Gu, Tianfan Xue, Ying Shan
Published	2024-12-04 17:58:03+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
