A Survey of Interactive Generative Video

Abstract

Interactive Generative Video (IGV) has emerged as a crucial technology in response to the growing demand for high-quality, interactive video content across various domains. In this paper, we define IGV as a technology that combines generative capabilities to produce diverse high-quality video content with interactive features that enable user engagement through control signals and responsive feedback. We survey the current landscape of IGV applications, focusing on three major domains: 1) gaming, where IGV enables infinite exploration in virtual worlds; 2) embodied AI, where IGV serves as a physics-aware environment synthesizer for training agents in multimodal interaction with dynamically evolving scenes; and 3) autonomous driving, where IGV provides closed-loop simulation capabilities for safety-critical testing and validation. To guide future development, we propose a comprehensive framework that decomposes an ideal IGV system into five essential modules: Generation, Control, Memory, Dynamics, and Intelligence. Furthermore, we systematically analyze the technical challenges and future directions in realizing each component for an ideal IGV system, such as achieving real-time generation, enabling open-domain control, maintaining long-term coherence, simulating accurate physics, and integrating causal reasoning. We believe that this systematic analysis will facilitate future research and development in the field of IGV, ultimately advancing the technology toward more sophisticated and practical applications.

arXiv information

Authors	Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Hao Chen, Xihui Liu
Published	2025-04-30 17:59:02+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
