DriveAgent: Multi-Agent Structured Reasoning with LLM and Multimodal Sensor Fusion for Autonomous Driving

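Summary

We introduce DriveAgent, a new multi-agent autonomous driving framework that combines large language model (LLM) reasoning with multimodal sensor fusion to strengthen situational understanding and decision-making.
DriveAgent integrates diverse sensor modalities, including camera, LiDAR, GPS, and IMU, with an LLM-driven analytical process structured across specialized agents.
The framework operates through a modular agent-based pipeline made up of four main modules: (i) a descriptive analysis agent that identifies critical sensor data events based on filtered timestamps; (ii) dedicated vehicle-level analysis by LiDAR and vision agents that jointly assess vehicle conditions and movements; (iii) environmental reasoning and causal analysis agents that explain contextual changes and their underlying mechanisms; and (iv) an urgency-aware decision-generation agent that prioritizes insights and proposes timely maneuvers.
This modular design lets the LLM coordinate the specialized perception and reasoning agents effectively and deliver cohesive, interpretable insights into complex autonomous driving scenarios.
Extensive experiments on challenging autonomous driving datasets show that DriveAgent outperforms baseline methods on multiple metrics.
These results validate the effectiveness of the proposed LLM-driven multi-agent sensor fusion framework and highlight its potential to substantially improve the robustness and reliability of autonomous driving systems.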
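
The four-module flow described above can be pictured as a chain of agents that each read the sensor frames and add to a shared report. The sketch below is only an illustration of that structure under loud assumptions: the abstract does not specify any interfaces, so every name here (`SensorFrame`, `Report`, the agent functions, the 10 m and 2 s thresholds) is hypothetical, and simple rules stand in for the LLM calls the paper actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SensorFrame:
    """Hypothetical fused sample; not the paper's data format."""
    timestamp: float
    camera_caption: str        # stand-in for a vision model's description
    lidar_min_range_m: float   # nearest LiDAR return, in metres
    speed_mps: float           # ego speed from GPS/IMU, in m/s


@dataclass
class Report:
    """Shared state that each agent extends in turn."""
    events: List[float] = field(default_factory=list)
    vehicle_notes: List[str] = field(default_factory=list)
    causes: List[str] = field(default_factory=list)
    decision: str = ""


def _flagged(frames: List[SensorFrame], report: Report) -> List[SensorFrame]:
    return [f for f in frames if f.timestamp in report.events]


def descriptive_agent(frames: List[SensorFrame], report: Report) -> None:
    # (i) Flag timestamps with a nearby obstacle (assumed 10 m threshold).
    report.events = [f.timestamp for f in frames if f.lidar_min_range_m < 10.0]


def vehicle_agent(frames: List[SensorFrame], report: Report) -> None:
    # (ii) Combine the LiDAR range and the camera description per event.
    for f in _flagged(frames, report):
        report.vehicle_notes.append(
            f"t={f.timestamp:.1f}s: {f.camera_caption} at {f.lidar_min_range_m:.1f} m"
        )


def causal_agent(frames: List[SensorFrame], report: Report) -> None:
    # (iii) Explain the change: is the gap shrinking across flagged frames?
    flagged = _flagged(frames, report)
    if len(flagged) >= 2 and flagged[-1].lidar_min_range_m < flagged[0].lidar_min_range_m:
        report.causes.append("gap to obstacle is closing")


def decision_agent(frames: List[SensorFrame], report: Report) -> None:
    # (iv) Urgency from a crude time-to-contact estimate (assumed 2 s cutoff).
    flagged = _flagged(frames, report)
    if not flagged:
        report.decision = "maintain"
        return
    last = flagged[-1]
    time_to_contact_s = last.lidar_min_range_m / max(last.speed_mps, 0.1)
    report.decision = "brake" if time_to_contact_s < 2.0 else "slow down"


Agent = Callable[[List[SensorFrame], Report], None]
PIPELINE: List[Agent] = [descriptive_agent, vehicle_agent, causal_agent, decision_agent]


def run_pipeline(frames: List[SensorFrame]) -> Report:
    report = Report()
    for agent in PIPELINE:
        agent(frames, report)
    return report


if __name__ == "__main__":
    demo = [
        SensorFrame(0.0, "clear road", 40.0, 10.0),
        SensorFrame(1.0, "car ahead", 9.0, 10.0),
        SensorFrame(2.0, "car ahead braking", 4.0, 10.0),
    ]
    print(run_pipeline(demo))
```

Keeping the agents as plain functions over a shared report makes the ordering explicit and lets any single stage be swapped for an LLM-backed version without touching the others; whether DriveAgent is organised this way is not stated in the abstract.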

Summary (original)

We introduce DriveAgent, a novel multi-agent autonomous driving framework that leverages large language model (LLM) reasoning combined with multimodal sensor fusion to enhance situational understanding and decision-making. DriveAgent uniquely integrates diverse sensor modalities (camera, LiDAR, GPS, and IMU) with LLM-driven analytical processes structured across specialized agents. The framework operates through a modular agent-based pipeline comprising four principal modules: (i) a descriptive analysis agent identifying critical sensor data events based on filtered timestamps, (ii) dedicated vehicle-level analysis conducted by LiDAR and vision agents that collaboratively assess vehicle conditions and movements, (iii) environmental reasoning and causal analysis agents explaining contextual changes and their underlying mechanisms, and (iv) an urgency-aware decision-generation agent prioritizing insights and proposing timely maneuvers. This modular design empowers the LLM to effectively coordinate specialized perception and reasoning agents, delivering cohesive, interpretable insights into complex autonomous driving scenarios. Extensive experiments on challenging autonomous driving datasets demonstrate that DriveAgent achieves superior performance on multiple metrics against baseline methods. These results validate the efficacy of the proposed LLM-driven multi-agent sensor fusion framework, underscoring its potential to substantially enhance the robustness and reliability of autonomous driving systems.

arXiv information

Authors	Xinmeng Hou, Wuqi Wang, Long Yang, Hao Lin, Jinglun Feng, Haigen Min, Xiangmo Zhao
Published	2025-05-04 14:13:00+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
