Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses

Summary

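In this paper, we present a method for reconstructing the world and multiple dynamic humans in 3D from a monocular video input.
As a key idea, we represent both the world and the humans with the recently introduced 3D Gaussian Splatting (3D-GS) representation, so that they can be composed and rendered together conveniently and efficiently.
In particular, we address 3D human reconstruction from severely limited and sparse observations, a common challenge in real-world footage.
To tackle this challenge, we introduce a novel approach that optimizes the 3D-GS representation in a canonical space by fusing the sparse cues in that common space, where we leverage a pre-trained 2D diffusion model to synthesize unseen views while staying consistent with the observed 2D appearances.
We demonstrate that our method can reconstruct high-quality, animatable 3D humans in a variety of challenging examples, in the presence of occlusion, image crops, few-shot inputs, and extremely sparse observations.
After reconstruction, our method can not only render the scene from any novel view at an arbitrary time instance, but also edit the 3D scene by removing individual humans or applying a different motion to each human.
Through various experiments, we demonstrate the quality and efficiency of our method compared with existing alternative approaches.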

Summary (original)

In this paper, we present a method to reconstruct the world and multiple dynamic humans in 3D from a monocular video input. As a key idea, we represent both the world and multiple humans via the recently emerging 3D Gaussian Splatting (3D-GS) representation, enabling to conveniently and efficiently compose and render them together. In particular, we address the scenarios with severely limited and sparse observations in 3D human reconstruction, a common challenge encountered in the real world. To tackle this challenge, we introduce a novel approach to optimize the 3D-GS representation in a canonical space by fusing the sparse cues in the common space, where we leverage a pre-trained 2D diffusion model to synthesize unseen views while keeping the consistency with the observed 2D appearances. We demonstrate our method can reconstruct high-quality animatable 3D humans in various challenging examples, in the presence of occlusion, image crops, few-shot, and extremely sparse observations. After reconstruction, our method is capable of not only rendering the scene in any novel views at arbitrary time instances, but also editing the 3D scene by removing individual humans or applying different motions for each human. Through various experiments, we demonstrate the quality and efficiency of our methods over alternative existing approaches.

arXiv information

Authors	Inhee Lee, Byungjun Kim, Hanbyul Joo
Published	2024-04-22 17:59:50+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
