Motion Prompting: Controlling Video Generation with Motion Trajectories

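Abstract

Motion control is crucial for generating expressive and compelling video content. However, most existing video generation models rely mainly on text prompts for control and struggle to capture the nuances of dynamic actions and temporal compositions. To this end, we train a video generation model conditioned on spatio-temporally sparse or dense motion trajectories. Because of its flexibility, we refer to this conditioning as motion prompts. While users may directly specify sparse trajectories, we also show how to translate high-level user requests into detailed, semi-dense motion prompts. We demonstrate the versatility of our approach through various applications, including camera and object motion control, "interacting" with an image, motion transfer, and image editing. Our results show emergent behaviors, such as realistic physics, suggesting the potential of motion prompts for probing video models and interacting with future generative world models. Finally, we evaluate quantitatively, conduct a human study, and demonstrate strong performance. Video results are available on our webpage: https://motion-prompting.github.io/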

Abstract (original)

Motion control is crucial for generating expressive and compelling video content; however, most existing video generation models rely mainly on text prompts for control, which struggle to capture the nuances of dynamic actions and temporal compositions. To this end, we train a video generation model conditioned on spatio-temporally sparse or dense motion trajectories. In contrast to prior motion conditioning work, this flexible representation can encode any number of trajectories, object-specific or global scene motion, and temporally sparse motion; due to its flexibility we refer to this conditioning as motion prompts. While users may directly specify sparse trajectories, we also show how to translate high-level user requests into detailed, semi-dense motion prompts, a process we term motion prompt expansion. We demonstrate the versatility of our approach through various applications, including camera and object motion control, ‘interacting’ with an image, motion transfer, and image editing. Our results showcase emergent behaviors, such as realistic physics, suggesting the potential of motion prompts for probing video models and interacting with future generative world models. Finally, we evaluate quantitatively, conduct a human study, and demonstrate strong performance. Video results are available on our webpage: https://motion-prompting.github.io/
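
The abstract does not say how motion trajectories are encoded, so the following is only an illustrative sketch of one way a set of spatially and temporally sparse point tracks could be held in memory, and of how a single user drag might be expanded into a small semi-dense bundle of tracks. The names `Track`, `specified_fraction`, and `drag_to_tracks`, and the parameters `half_width` and `spacing`, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates


@dataclass
class Track:
    # One entry per frame; None means the point is left unspecified there.
    points: List[Optional[Point]]


def specified_fraction(tracks: List[Track]) -> float:
    """Share of (track, frame) slots that carry a position."""
    slots = [p for tr in tracks for p in tr.points]
    return sum(p is not None for p in slots) / len(slots) if slots else 0.0


def drag_to_tracks(start: Point, end: Point, num_frames: int,
                   half_width: int = 1, spacing: float = 8.0) -> List[Track]:
    """Hypothetical expansion of one drag into a small grid of parallel tracks."""
    tracks = []
    for i in range(-half_width, half_width + 1):
        for j in range(-half_width, half_width + 1):
            pts: List[Optional[Point]] = []
            for t in range(num_frames):
                # Linear interpolation from start to end over the clip.
                a = t / (num_frames - 1) if num_frames > 1 else 0.0
                pts.append((start[0] + i * spacing + a * (end[0] - start[0]),
                            start[1] + j * spacing + a * (end[1] - start[1])))
            tracks.append(Track(pts))
    return tracks


# A temporally sparse prompt: one point given only at the first and last frame.
sparse = [Track([(0.0, 0.0), None, None, (5.0, 5.0)])]
# A drag of 40 pixels to the right, expanded into a 3x3 bundle of tracks.
grid = drag_to_tracks((10.0, 10.0), (50.0, 10.0), num_frames=4)
```

In this sketch any number of tracks can be listed, and a track may omit frames, which loosely mirrors the "any number of trajectories" and "temporally sparse motion" properties the abstract mentions. The linear interpolation is an assumption chosen for brevity.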

arXiv information

Authors	Daniel Geng, Charles Herrmann, Junhwa Hur, Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun
Published	2024-12-03 18:59:56+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
