MAGI-1: Autoregressive Video Generation at Scale

Abstract

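We present MAGI-1, a world model that generates video by autoregressively predicting a sequence of video chunks, where each chunk is a fixed-length segment of consecutive frames.
MAGI-1 is trained to denoise per-chunk noise that increases monotonically over time, which enables causal temporal modeling and naturally supports streaming generation.
It achieves strong performance on image-to-video (I2V) tasks conditioned on text instructions, with high temporal consistency and scalability made possible by several algorithmic innovations and a dedicated infrastructure stack.
MAGI-1 facilitates controllable generation through chunk-wise prompting and supports real-time, memory-efficient deployment by keeping peak inference cost constant regardless of video length.
The largest MAGI-1 variant has 24 billion parameters and supports context lengths of up to 4 million tokens, demonstrating the scalability and robustness of the approach.
The code and models are available at https://github.com/SandAI-org/MAGI-1 and https://github.com/SandAI-org/MagiAttention.
The product can be accessed at https://sand.ai.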
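
The abstract does not give the noise schedule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical linear, staggered schedule in which each chunk starts denoising `stride` steps after the previous one and needs `total` steps to become clean. The names `noise_level`, `active_chunks`, `stride`, and `total` are illustrative assumptions. Under these assumptions, later chunks are never less noisy than earlier ones, and the number of chunks being denoised at any step is bounded regardless of video length.

```python
def noise_level(step: int, chunk: int, stride: int = 4, total: int = 16) -> float:
    """Noise of `chunk` at global `step` (1.0 = pure noise, 0.0 = clean).

    Assumed schedule: chunk c starts denoising at step c * stride and
    moves linearly from 1.0 to 0.0 over `total` steps.
    """
    progress = (step - chunk * stride) / total
    return min(1.0, max(0.0, 1.0 - progress))


def active_chunks(step: int, num_chunks: int, stride: int = 4, total: int = 16) -> list[int]:
    """Chunks that are being denoised at `step` under the assumed schedule."""
    return [
        c for c in range(num_chunks)
        if c * stride <= step < c * stride + total
    ]


if __name__ == "__main__":
    step = 16
    # Earlier chunks are cleaner, later chunks noisier.
    print([noise_level(step, c) for c in range(6)])
    # Only a bounded window of chunks needs compute at this step.
    print(active_chunks(step, num_chunks=1000))
```

With these hypothetical parameters, at most `total / stride` chunks are in flight at once, which is one simple way to see how a chunk-wise design can keep peak work independent of the total number of chunks.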

Abstract (original)

We present MAGI-1, a world model that generates videos by autoregressively predicting a sequence of video chunks, defined as fixed-length segments of consecutive frames. Trained to denoise per-chunk noise that increases monotonically over time, MAGI-1 enables causal temporal modeling and naturally supports streaming generation. It achieves strong performance on image-to-video (I2V) tasks conditioned on text instructions, providing high temporal consistency and scalability, which are made possible by several algorithmic innovations and a dedicated infrastructure stack. MAGI-1 facilitates controllable generation via chunk-wise prompting and supports real-time, memory-efficient deployment by maintaining constant peak inference cost, regardless of video length. The largest variant of MAGI-1 comprises 24 billion parameters and supports context lengths of up to 4 million tokens, demonstrating the scalability and robustness of our approach. The code and models are available at https://github.com/SandAI-org/MAGI-1 and https://github.com/SandAI-org/MagiAttention. The product can be accessed at https://sand.ai.

arXiv information

Authors	Sand. ai, Hansi Teng, Hongyu Jia, Lei Sun, Lingzhi Li, Maolin Li, Mingqiu Tang, Shuai Han, Tianning Zhang, W. Q. Zhang, Weifeng Luo, Xiaoyang Kang, Yuchen Sun, Yue Cao, Yunpeng Huang, Yutong Lin, Yuxin Fang, Zewei Tao, Zheng Zhang, Zhongshu Wang, Zixun Liu, Dai Shi, Guoli Su, Hanwen Sun, Hong Pan, Jie Wang, Jiexin Sheng, Min Cui, Min Hu, Ming Yan, Shucheng Yin, Siran Zhang, Tingting Liu, Xianping Yin, Xiaoyu Yang, Xin Song, Xuan Hu, Yankai Zhang, Yuqiao Li
Published	2025-05-19 14:58:50+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
