UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields

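Summary

Generating photorealistic images with controllable camera pose and scene content is essential for many applications, including AR/VR and simulation.
Despite rapid progress in 3D-aware generative models, most existing methods focus on object-centric images and cannot generate urban scenes that support free camera viewpoint control and scene editing.
To address this challenging task, we propose UrbanGIRAFFE, which guides a 3D-aware generative model with a coarse 3D panoptic prior that includes the layout distribution of uncountable stuff and countable objects.
Our model is compositional and controllable because it decomposes the scene into stuff, objects, and sky.
Using a stuff prior in the form of semantic voxel grids, we build a conditioned stuff generator that effectively incorporates the coarse semantic and geometric information.
The object layout prior additionally lets us learn an object generator from cluttered scenes.
With proper loss functions, our approach enables photorealistic 3D-aware image synthesis with diverse controllability, including large camera movement, stuff editing, and object manipulation.
We validate the effectiveness of our model on both synthetic and real-world datasets, including the challenging KITTI-360 dataset.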

Summary (original)

Generating photorealistic images with controllable camera pose and scene contents is essential for many applications including AR/VR and simulation. Despite the fact that rapid progress has been made in 3D-aware generative models, most existing methods focus on object-centric images and are not applicable to generating urban scenes for free camera viewpoint control and scene editing. To address this challenging task, we propose UrbanGIRAFFE, which uses a coarse 3D panoptic prior, including the layout distribution of uncountable stuff and countable objects, to guide a 3D-aware generative model. Our model is compositional and controllable as it breaks down the scene into stuff, objects, and sky. Using stuff prior in the form of semantic voxel grids, we build a conditioned stuff generator that effectively incorporates the coarse semantic and geometry information. The object layout prior further allows us to learn an object generator from cluttered scenes. With proper loss functions, our approach facilitates photorealistic 3D-aware image synthesis with diverse controllability, including large camera movement, stuff editing, and object manipulation. We validate the effectiveness of our model on both synthetic and real-world datasets, including the challenging KITTI-360 dataset.
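
The decomposition into stuff, objects, and sky can be illustrated with a small sketch of how per-component feature fields might be merged along one camera ray. This is not the authors' implementation: the function name `composite_ray`, the array shapes, and the composition rule (summed densities with density-weighted feature averaging, followed by standard alpha compositing with the sky as background) are assumptions chosen for illustration.

```python
import numpy as np

def composite_ray(sigmas, feats, deltas, sky_feat):
    """Merge K scene components along one ray (hypothetical helper).

    sigmas:   (K, S) densities of K components (e.g. stuff, objects) at S samples
    feats:    (K, S, C) feature vectors of each component at each sample
    deltas:   (S,) distances between consecutive samples
    sky_feat: (C,) background feature used where the ray is not absorbed
    """
    # Assumed composition rule: total density is the sum over components.
    sigma = sigmas.sum(axis=0)                                  # (S,)
    # Density-weighted mean of features; guard against division by zero.
    w = sigmas / np.maximum(sigma, 1e-8)                        # (K, S)
    feat = (w[..., None] * feats).sum(axis=0)                   # (S, C)

    # Standard volume rendering: per-sample opacity and transmittance.
    alpha = 1.0 - np.exp(-sigma * deltas)                       # (S,)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha                                     # (S,)

    # Whatever transmittance is left after the last sample goes to the sky.
    return (weights[:, None] * feat).sum(axis=0) + (1.0 - weights.sum()) * sky_feat

# Usage: two components (stuff, object), four samples, three feature channels.
rng = np.random.default_rng(0)
pixel_feat = composite_ray(
    sigmas=rng.uniform(0.0, 2.0, size=(2, 4)),
    feats=rng.normal(size=(2, 4, 3)),
    deltas=np.full(4, 0.5),
    sky_feat=np.zeros(3),
)
```

Under these assumptions, editing a component (for example, zeroing the density of one object) changes only its contribution to the sum, which is what makes a compositional representation convenient for scene editing.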

arXiv information

Authors	Yuanbo Yang, Yifei Yang, Hanlei Guo, Rong Xiong, Yue Wang, Yiyi Liao
Published	2023-03-24 17:28:07+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
