HumanRAM: Feed-forward Human Reconstruction and Animation Model using Transformers

Abstract

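3D human reconstruction and animation are long-standing topics in computer graphics and vision. Existing methods, however, generally depend on elaborate dense-view capture or on time-consuming per-subject optimization. To address these limitations, we propose HumanRAM, a new feed-forward approach to generalizable human reconstruction and animation from monocular or sparse human images. The approach unifies reconstruction and animation in a single framework by introducing explicit pose conditions, parameterized by a shared SMPL-X neural texture, into transformer-based large reconstruction models (LRM). Given monocular or sparse input images with their camera parameters and SMPL-X poses, our model uses scalable transformers and a DPT-based decoder to synthesize realistic human renderings under novel viewpoints and novel poses. The explicit pose conditions let the model deliver high-quality reconstruction and high-fidelity pose-controlled animation at the same time. Experiments show that HumanRAM significantly outperforms previous methods on real-world datasets in reconstruction accuracy, animation fidelity, and generalization. Video results are available at https://zju3dv.github.io/humanram/.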

Abstract (original)

3D human reconstruction and animation are long-standing topics in computer graphics and vision. However, existing methods typically rely on sophisticated dense-view capture and/or time-consuming per-subject optimization procedures. To address these limitations, we propose HumanRAM, a novel feed-forward approach for generalizable human reconstruction and animation from monocular or sparse human images. Our approach integrates human reconstruction and animation into a unified framework by introducing explicit pose conditions, parameterized by a shared SMPL-X neural texture, into transformer-based large reconstruction models (LRM). Given monocular or sparse input images with associated camera parameters and SMPL-X poses, our model employs scalable transformers and a DPT-based decoder to synthesize realistic human renderings under novel viewpoints and novel poses. By leveraging the explicit pose conditions, our model simultaneously enables high-quality human reconstruction and high-fidelity pose-controlled animation. Experiments show that HumanRAM significantly surpasses previous methods in terms of reconstruction accuracy, animation fidelity, and generalization performance on real-world datasets. Video results are available at https://zju3dv.github.io/humanram/.
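The abstract does not say how the shared SMPL-X neural texture is turned into a pose condition. One plausible reading, offered here purely as an assumption, is that a learnable feature texture defined in SMPL-X UV space is sampled at the UV coordinates rasterized from the posed mesh for a given view, which yields a per-pixel feature image. The Python sketch below illustrates only that sampling step. The names `sample_texture` and `pose_condition_image` and the UV-map input format are hypothetical, and the rasterizer that would produce the UV map is outside the sketch.

```python
from typing import List, Optional, Tuple

Texel = List[float]                      # one C-dimensional feature vector
Texture = List[List[Texel]]              # H x W grid of texels
UVMap = List[List[Optional[Tuple[float, float]]]]  # per pixel (u, v) or None


def sample_texture(texture: Texture, u: float, v: float) -> Texel:
    """Bilinearly sample a feature texture at UV coordinates in [0, 1]."""
    height, width = len(texture), len(texture[0])
    # Map UV to continuous texel coordinates, clamped to the texture bounds.
    x = min(max(u, 0.0), 1.0) * (width - 1)
    y = min(max(v, 0.0), 1.0) * (height - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0
    channels = len(texture[0][0])
    return [
        (1 - wy) * ((1 - wx) * texture[y0][x0][c] + wx * texture[y0][x1][c])
        + wy * ((1 - wx) * texture[y1][x0][c] + wx * texture[y1][x1][c])
        for c in range(channels)
    ]


def pose_condition_image(texture: Texture, uv_map: UVMap) -> List[List[Texel]]:
    """Build a feature image from a UV map (assumed: None marks background)."""
    channels = len(texture[0][0])
    return [
        [
            sample_texture(texture, uv[0], uv[1]) if uv is not None
            else [0.0] * channels  # background pixels get a zero feature
            for uv in row
        ]
        for row in uv_map
    ]


# Toy usage: a 2x2 texture with one channel and a 2x2 UV map.
toy_texture: Texture = [[[0.0], [1.0]], [[2.0], [3.0]]]
toy_uv_map: UVMap = [[None, (0.0, 0.0)], [(1.0, 0.0), (0.5, 0.5)]]
toy_condition = pose_condition_image(toy_texture, toy_uv_map)
```

Because the texture lives in UV space and not in image space, the same texels are reused for every pose and viewpoint, which is one way a single texture could be "shared" across inputs and targets. In a trained model the texel values would be learned parameters, not fixed numbers as in this toy example.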

arXiv information

Authors	Zhiyuan Yu, Zhe Li, Hujun Bao, Can Yang, Xiaowei Zhou
Published	2025-06-03 17:50:05+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
