Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception

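Summary

Infrastructure sensors mounted at elevated positions provide a wider perception range and are less prone to occlusion.
Fusing infrastructure and ego-vehicle data over V2X communication, an approach known as vehicle-infrastructure cooperation, has shown clear advantages in strengthening perception and in handling corner cases that single-vehicle autonomous driving encounters.
However, cooperative perception still faces many challenges, including limited communication bandwidth and communication interruptions that occur in practice.
This paper proposes CTCE, a new framework for cooperative 3D object detection.
The framework transmits queries enhanced with temporal context, balancing transmission efficiency against performance to suit real-world communication conditions.
The paper also proposes a temporal-guided fusion module to improve performance further.
Roadside temporal enhancement and vehicle-side spatial-temporal fusion together form a multi-level temporal context integration mechanism that makes full use of temporal information.
In addition, a motion-aware reconstruction module is introduced to recover roadside queries lost to communication interruptions.
Experiments on the V2X-Seq and V2X-Sim datasets show that CTCE outperforms the QUEST baseline, improving mAP by 3.8% and 1.3%, respectively.
Experiments under communication-interruption conditions confirm CTCE's robustness to such interruptions.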
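
The summary above does not say how the motion-aware reconstruction module works internally. As a rough illustration of the general idea only (recovering roadside queries for a frame that was lost in transmission), the sketch below extrapolates the last received queries forward under a constant-velocity assumption. The `RoadsideQuery` type, its fields, and `reconstruct_queries` are hypothetical names invented for this example; they are not CTCE's actual data structures or algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RoadsideQuery:
    """Hypothetical stand-in for one transmitted object query."""
    track_id: int
    position: Tuple[float, float]  # (x, y) in metres, shared world frame
    velocity: Tuple[float, float]  # (vx, vy) in metres per second


def reconstruct_queries(
    received: Optional[List[RoadsideQuery]],
    last_known: List[RoadsideQuery],
    dt: float,
) -> List[RoadsideQuery]:
    """Return the roadside queries to use for the current frame.

    If the frame arrived, it is used unchanged. If it was lost
    (received is None), each last-known query is moved forward by
    dt seconds assuming constant velocity. This is an illustrative
    assumption, not the module described in the paper.
    """
    if received is not None:
        return received
    return [
        RoadsideQuery(
            track_id=q.track_id,
            position=(
                q.position[0] + q.velocity[0] * dt,
                q.position[1] + q.velocity[1] * dt,
            ),
            velocity=q.velocity,
        )
        for q in last_known
    ]


if __name__ == "__main__":
    previous = [RoadsideQuery(7, (10.0, 2.0), (5.0, 0.0))]
    # Simulate a dropped frame 0.1 s after the last successful one.
    for q in reconstruct_queries(None, previous, dt=0.1):
        print(q.track_id, q.position)
```

A learned module could replace the constant-velocity step with a prediction conditioned on query features; the sketch only shows where such a fallback would sit in the vehicle-side pipeline.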

Abstract (original)

Infrastructure sensors installed at elevated positions offer a broader perception range and encounter fewer occlusions. Integrating both infrastructure and ego-vehicle data through V2X communication, known as vehicle-infrastructure cooperation, has shown considerable advantages in enhancing perception capabilities and addressing corner cases encountered in single-vehicle autonomous driving. However, cooperative perception still faces numerous challenges, including limited communication bandwidth and practical communication interruptions. In this paper, we propose CTCE, a novel framework for cooperative 3D object detection. This framework transmits queries with temporal contexts enhancement, effectively balancing transmission efficiency and performance to accommodate real-world communication conditions. Additionally, we propose a temporal-guided fusion module to further improve performance. The roadside temporal enhancement and vehicle-side spatial-temporal fusion together constitute a multi-level temporal contexts integration mechanism, fully leveraging temporal information to enhance performance. Furthermore, a motion-aware reconstruction module is introduced to recover lost roadside queries due to communication interruptions. Experimental results on V2X-Seq and V2X-Sim datasets demonstrate that CTCE outperforms the baseline QUEST, achieving improvements of 3.8% and 1.3% in mAP, respectively. Experiments under communication interruption conditions validate CTCE’s robustness to communication interruptions.

arXiv information

Authors	Jiaru Zhong, Haibao Yu, Tianyi Zhu, Jiahui Xu, Wenxian Yang, Zaiqing Nie, Chao Sun
Published	2024-08-20 04:16:13+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
