GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving

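Abstract

Generative models offer a scalable and flexible paradigm for simulating complex environments, but current approaches fall short of the domain-specific requirements of autonomous driving, such as multi-agent interactions, fine-grained control, and multi-camera consistency.
We introduce GAIA-2 (Generative AI for Autonomy), a latent diffusion world model that unifies these capabilities within a single generative framework.
GAIA-2 supports controllable video generation conditioned on a rich set of structured inputs: ego-vehicle dynamics, agent configurations, environmental factors, and road semantics.
It generates high-resolution, spatiotemporally consistent multi-camera videos across geographically diverse driving environments (UK, US, Germany).
The model integrates both structured conditioning and external latent embeddings (for example, from a proprietary driving model) to enable flexible, semantically grounded scene synthesis.
Through this integration, GAIA-2 enables scalable simulation of both common and rare driving scenarios, advancing the use of generative world models as a core tool in the development of autonomous systems.
Videos are available at https://wayve.ai/thinking/gaia-2.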

Abstract (original)

Generative models offer a scalable and flexible paradigm for simulating complex environments, yet current approaches fall short in addressing the domain-specific requirements of autonomous driving – such as multi-agent interactions, fine-grained control, and multi-camera consistency. We introduce GAIA-2, Generative AI for Autonomy, a latent diffusion world model that unifies these capabilities within a single generative framework. GAIA-2 supports controllable video generation conditioned on a rich set of structured inputs: ego-vehicle dynamics, agent configurations, environmental factors, and road semantics. It generates high-resolution, spatiotemporally consistent multi-camera videos across geographically diverse driving environments (UK, US, Germany). The model integrates both structured conditioning and external latent embeddings (e.g., from a proprietary driving model) to facilitate flexible and semantically grounded scene synthesis. Through this integration, GAIA-2 enables scalable simulation of both common and rare driving scenarios, advancing the use of generative world models as a core tool in the development of autonomous systems. Videos are available at https://wayve.ai/thinking/gaia-2.

arXiv information

Authors	Lloyd Russell, Anthony Hu, Lorenzo Bertoni, George Fedoseev, Jamie Shotton, Elahe Arani, Gianluca Corrado
Published	2025-03-26 13:11:35+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
