Synthesizing Moving People with 3D Control

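Summary

In this paper, we present a diffusion-model-based framework for animating a person from a single image, given a target 3D motion sequence.
Our approach has two core components: a) learning priors about the unseen parts of the human body and clothing, and b) rendering novel body poses with appropriate clothing and texture.
For the first component, we train an in-filling diffusion model to hallucinate the parts of a person that are not visible in the single input image.
We train this model in texture-map space, which is invariant to pose and viewpoint and therefore makes training more sample-efficient.
For the second component, we develop a diffusion-based rendering pipeline controlled by 3D human poses.
It produces realistic renderings of the person in novel poses, including clothing, hair, and plausible in-filling of unseen regions.
This disentangled approach lets our method generate a sequence of images that is faithful to the target motion in 3D pose and to the input image in visual similarity.
In addition, the 3D control makes it possible to render the person along a variety of synthetic camera trajectories.
Our experiments show that, compared with prior methods, our method is resilient when generating prolonged motions and a wide range of challenging, complex poses.
For more details, see the website: https://boyiliee.github.io/3DHM.github.io/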
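
The summary mentions rendering a person along synthetic camera trajectories but does not say how those trajectories are defined. As a rough illustration only, the sketch below builds one common kind of trajectory, a circular orbit around a subject, as a stack of world-to-camera matrices. The function names `look_at` and `orbit_trajectory`, the OpenGL-style camera convention, and the default radius, height, and target are all assumptions made for this example; none of them come from the paper.

```python
import numpy as np


def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Build a 4x4 world-to-camera matrix.

    Assumed convention (not from the paper): the camera looks down its
    own -Z axis, with +X to the right and +Y up.
    """
    eye = np.asarray(eye, dtype=float)
    target = np.asarray(target, dtype=float)

    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    # Rows of `rot` are the camera axes expressed in world coordinates.
    rot = np.stack([right, true_up, -forward])
    view = np.eye(4)
    view[:3, :3] = rot
    view[:3, 3] = -rot @ eye
    return view


def orbit_trajectory(num_frames, radius=3.0, height=1.0, target=(0.0, 1.0, 0.0)):
    """Return (num_frames, 4, 4) view matrices on a horizontal circle.

    Hypothetical helper: the cameras are evenly spaced around `target`
    at a fixed `height`, and each one looks at `target`.
    """
    target = np.asarray(target, dtype=float)
    angles = np.linspace(0.0, 2.0 * np.pi, num_frames, endpoint=False)
    views = []
    for angle in angles:
        eye = np.array([
            target[0] + radius * np.sin(angle),
            height,
            target[2] + radius * np.cos(angle),
        ])
        views.append(look_at(eye, target))
    return np.stack(views)


views = orbit_trajectory(num_frames=8)
```

Each matrix in `views` could then be handed to whatever renderer consumes camera extrinsics, one matrix per output frame. A renderer that uses a different axis convention would need the matrices converted first.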

Summary (original)

In this paper, we present a diffusion model-based framework for animating people from a single image for a given target 3D motion sequence. Our approach has two core components: a) learning priors about invisible parts of the human body and clothing, and b) rendering novel body poses with proper clothing and texture. For the first part, we learn an in-filling diffusion model to hallucinate unseen parts of a person given a single image. We train this model on texture map space, which makes it more sample-efficient since it is invariant to pose and viewpoint. Second, we develop a diffusion-based rendering pipeline, which is controlled by 3D human poses. This produces realistic renderings of novel poses of the person, including clothing, hair, and plausible in-filling of unseen regions. This disentangled approach allows our method to generate a sequence of images that are faithful to the target motion in the 3D pose and, to the input image in terms of visual similarity. In addition to that, the 3D control allows various synthetic camera trajectories to render a person. Our experiments show that our method is resilient in generating prolonged motions and varied challenging and complex poses compared to prior methods. Please check our website for more details: https://boyiliee.github.io/3DHM.github.io/.

arXiv information

Authors	Boyi Li, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik
Published	2024-01-19 18:59:11+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
