Joint Reconstruction of Spatially-Coherent and Realistic Clothed Humans and Objects from a Single Image

Abstract

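Recent advances in human shape learning have focused on accurately reconstructing humans from single-view images.
In the real world, however, humans share space with other objects.
Reconstructing images that contain both humans and objects is challenging because of occlusions and a lack of 3D spatial awareness, which leads to depth ambiguity in the reconstruction.
Existing monocular human-object reconstruction methods are template-based and therefore cannot capture the intricate details of clothed human bodies and object surfaces.
This paper jointly reconstructs clothed humans and objects from single-view images in a spatially coherent manner while addressing human-object occlusions.
It proposes a novel attention-based neural implicit model that leverages image pixel alignment to recover high-quality details and incorporates semantic features extracted from the human-object pose to enable 3D spatial awareness.
A generative diffusion model is used to handle human-object occlusions.
For training and evaluation, the authors introduce a synthetic dataset of rendered scenes containing inter-occluded 3D human scans and diverse objects.
Extensive evaluation on both synthetic and real datasets shows that the proposed human-object reconstructions are of higher quality than those of competing methods.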
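
The "image pixel alignment" mentioned above can be illustrated with a minimal sketch of the general pixel-aligned implicit function idea: each 3D query point is projected onto the image, a feature vector is sampled at that pixel, and the point's depth is appended. This is not the authors' implementation. The orthographic camera, the [-1, 1] coordinate range, and all function names are assumptions made for illustration, and the attention module and pose-derived semantic features are left out.

```python
import numpy as np

def project_orthographic(points, image_size):
    """Map 3D points in [-1, 1]^3 to continuous pixel coordinates (x, y).

    Assumes an orthographic camera looking along the z axis (hypothetical setup).
    """
    h, w = image_size
    x = (points[:, 0] + 1.0) * 0.5 * (w - 1)
    y = (1.0 - (points[:, 1] + 1.0) * 0.5) * (h - 1)  # image y grows downward
    return np.stack([x, y], axis=1)

def sample_bilinear(feature_map, pixels):
    """Bilinearly sample an (H, W, C) feature map at continuous (x, y) pixels."""
    h, w, _ = feature_map.shape
    x = np.clip(pixels[:, 0], 0, w - 1)
    y = np.clip(pixels[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = feature_map[y0, x0] * (1 - wx) + feature_map[y0, x1] * wx
    bottom = feature_map[y1, x0] * (1 - wx) + feature_map[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def pixel_aligned_query(feature_map, points):
    """Return per-point inputs for an implicit function: image features plus depth."""
    pixels = project_orthographic(points, feature_map.shape[:2])
    feats = sample_bilinear(feature_map, pixels)
    depth = points[:, 2:3]
    return np.concatenate([feats, depth], axis=1)
```

In a full model, the returned vectors would be passed to a learned network that predicts occupancy or signed distance for each query point. In the paper's description, they would also be combined with semantic features extracted from the human-object pose.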

Abstract (original)

Recent advances in human shape learning have focused on achieving accurate human reconstruction from single-view images. However, in the real world, humans share space with other objects. Reconstructing images with humans and objects is challenging due to the occlusions and lack of 3D spatial awareness, which leads to depth ambiguity in the reconstruction. Existing methods in monocular human-object reconstruction fail to capture intricate details of clothed human bodies and object surfaces due to their template-based nature. In this paper, we jointly reconstruct clothed humans and objects in a spatially coherent manner from single-view images, while addressing human-object occlusions. A novel attention-based neural implicit model is proposed that leverages image pixel alignment to retrieve high-quality details, and incorporates semantic features extracted from the human-object pose to enable 3D spatial awareness. A generative diffusion model is used to handle human-object occlusions. For training and evaluation, we introduce a synthetic dataset with rendered scenes of inter-occluded 3D human scans and diverse objects. Extensive evaluation on both synthetic and real datasets demonstrates the superior quality of proposed human-object reconstructions over competitive methods.

arXiv information

Authors	Ayushi Dutta, Marco Pesavento, Marco Volino, Adrian Hilton, Armin Mustafa
Published	2025-02-25 12:26:36+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
