Move-in-2D: 2D-Conditioned Human Motion Generation

Abstract

Generating realistic human videos remains a challenging task, with the most effective methods currently relying on a human motion sequence as a control signal. Existing approaches often use existing motion extracted from other videos, which restricts applications to specific motion types and global scene matching. We propose Move-in-2D, a novel approach to generate human motion sequences conditioned on a scene image, allowing for diverse motion that adapts to different scenes. Our approach utilizes a diffusion model that accepts both a scene image and text prompt as inputs, producing a motion sequence tailored to the scene. To train this model, we collect a large-scale video dataset featuring single-human activities, annotating each video with the corresponding human motion as the target output. Experiments demonstrate that our method effectively predicts human motion that aligns with the scene image after projection. Furthermore, we show that the generated motion sequence improves human motion quality in video synthesis tasks.

arXiv information

Authors	Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang, Difan Liu, Feng Liu, Ming-Hsuan Yang, Zhan Xu
Published	2024-12-17 18:58:07+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
