Benchmarking and Improving Bird’s Eye View Perception Robustness in Autonomous Driving

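Summary

Recent advances in bird’s eye view (BEV) representations show considerable promise for in-vehicle 3D perception. Although these methods achieve excellent results on standard benchmarks, their robustness under varied conditions has not been adequately evaluated. This study introduces RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. The suite covers a diverse set of camera corruption types, each tested at three severity levels. The benchmark also accounts for the complete sensor failures that can occur when multi-modal models are used. With RoboBEV, we evaluate 33 state-of-the-art BEV-based perception models across tasks including detection, map segmentation, depth estimation, and occupancy prediction. Our analysis reveals a notable correlation between a model’s performance on in-distribution datasets and its resilience to out-of-distribution challenges. The experiments also show that strategies such as pre-training and depth-free BEV transformations are effective in strengthening robustness to out-of-distribution data. We further find that exploiting extensive temporal information substantially improves model robustness. Building on these observations, we design an effective robustness enhancement strategy based on the CLIP model. The findings of this study pave the way for future BEV models that combine accuracy with real-world robustness.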

Summary (original)

Recent advancements in bird’s eye view (BEV) representations have shown remarkable promise for in-vehicle 3D perception. However, while these methods have achieved impressive results on standard benchmarks, their robustness in varied conditions remains insufficiently assessed. In this study, we present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. This suite incorporates a diverse set of camera corruption types, each examined over three severity levels. Our benchmarks also consider the impact of complete sensor failures that occur when using multi-modal models. Through RoboBEV, we assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction. Our analyses reveal a noticeable correlation between the model’s performance on in-distribution datasets and its resilience to out-of-distribution challenges. Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data. Furthermore, we observe that leveraging extensive temporal information significantly improves the model’s robustness. Based on our observations, we design an effective robustness enhancement strategy based on the CLIP model. The insights from this study pave the way for the development of future BEV models that seamlessly combine accuracy with real-world robustness.

arXiv information

Authors	Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu
Published	2025-02-01 12:49:41+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
