Free-Viewpoint RGB-D Human Performance Capture and Rendering

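Summary

Capturing photorealistic humans and rendering them faithfully from novel viewpoints is a fundamental problem for AR/VR applications.
Prior work has shown impressive performance-capture results in laboratory settings, but casual free-viewpoint capture and rendering of unseen identities at high fidelity remains non-trivial, especially for facial expressions, hands, and clothing.
To address these challenges, we introduce a novel view synthesis framework that generates realistic renders of a person from unseen views, using a single-view, sparse RGB-D sensor comparable to a low-cost depth camera and no actor-specific model.
We propose an architecture that creates dense feature maps in the novel view via sphere-based neural rendering and produces complete renders with a global-context inpainting model.
In addition, an enhancer network improves overall fidelity, producing crisp renders with fine detail even in regions occluded in the original view.
We show that the method generates high-quality novel views of synthetic and real human actors from a single-stream, sparse RGB-D input.
It generalizes to unseen identities and new poses, and faithfully reconstructs facial expressions.
The approach outperforms prior view synthesis methods and is robust to different levels of depth sparsity.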
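
To make the starting point of such a pipeline concrete, the following is a minimal sketch of warping sparse RGB-D samples into a novel view with a pinhole camera model. It is not the paper's sphere-based neural renderer: it is a plain nearest-depth point splat, swapped in for illustration only. All names (`Intrinsics`, `unproject`, `transform`, `project`, `reproject`) and the choice of sharing one set of intrinsics between source and target views are assumptions. The output is sparse and leaves holes, which is the kind of gap a dense renderer and an inpainting stage would have to fill.

```python
# Hypothetical sketch: forward-warp sparse RGB-D samples into a novel view.
# This is a simple z-buffered point splat, NOT the sphere-based neural
# rendering described in the abstract. All names are illustrative.
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics (assumed shared by source and target views)."""
    fx: float
    fy: float
    cx: float
    cy: float


def unproject(u, v, depth, k):
    """Lift pixel (u, v) with depth along the optical axis to a 3D point."""
    x = (u - k.cx) * depth / k.fx
    y = (v - k.cy) * depth / k.fy
    return (x, y, depth)


def transform(point, rotation, translation):
    """Apply a rigid transform: 3x3 row-major rotation, then translation."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point, k):
    """Project a 3D point to pixel coordinates; None if behind the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (k.fx * x / z + k.cx, k.fy * y / z + k.cy)


def reproject(samples, k, rotation, translation, width, height):
    """Warp (u, v, depth, rgb) samples into the target view.

    Returns a dict mapping (col, row) to (depth, rgb), keeping the sample
    nearest to the target camera at each pixel. Pixels that receive no
    sample are simply absent from the dict.
    """
    target = {}
    for u, v, depth, rgb in samples:
        if depth <= 0:  # sparse sensor: skip pixels without a depth reading
            continue
        p = transform(unproject(u, v, depth, k), rotation, translation)
        uv = project(p, k)
        if uv is None:
            continue
        col, row = round(uv[0]), round(uv[1])
        if not (0 <= col < width and 0 <= row < height):
            continue
        if (col, row) not in target or p[2] < target[(col, row)][0]:
            target[(col, row)] = (p[2], rgb)
    return target
```

Calling `reproject` with an identity rotation and a small sideways translation shifts each sample horizontally by an amount inversely proportional to its depth, so nearer surfaces move further and can uncover regions the source view never observed.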

Summary (original)

Capturing and faithfully rendering photo-realistic humans from novel views is a fundamental problem for AR/VR applications. While prior work has shown impressive performance capture results in laboratory settings, it is non-trivial to achieve casual free-viewpoint human capture and rendering for unseen identities with high fidelity, especially for facial expressions, hands, and clothes. To tackle these challenges we introduce a novel view synthesis framework that generates realistic renders from unseen views of any human captured from a single-view and sparse RGB-D sensor, similar to a low-cost depth camera, and without actor-specific models. We propose an architecture to create dense feature maps in novel views obtained by sphere-based neural rendering, and create complete renders using a global context inpainting model. Additionally, an enhancer network leverages the overall fidelity, even in occluded areas from the original view, producing crisp renders with fine details. We show that our method generates high-quality novel views of synthetic and real human actors given a single-stream, sparse RGB-D input. It generalizes to unseen identities, and new poses and faithfully reconstructs facial expressions. Our approach outperforms prior view synthesis methods and is robust to different levels of depth sparsity.

arXiv information

Authors	Phong Nguyen-Ha, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkila, Tony Tung
Published	2022-08-02 10:58:01+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
