Valeo4Cast: A Modular Approach to End-to-End Forecasting




Motion forecasting is crucial in autonomous driving systems to anticipate the future trajectories of surrounding agents such as pedestrians, vehicles, and traffic signals. In end-to-end forecasting, the model must jointly detect, from sensor data (cameras or LiDARs), the position and past trajectories of the different elements of the scene and predict their future locations. We depart from the current trend of tackling this task via end-to-end training from perception to forecasting and instead use a modular approach. Following a recent study, we build and train the detection, tracking, and forecasting modules individually. We then apply only consecutive finetuning steps to better integrate the modules and alleviate compounding errors. Our study reveals that this simple yet effective approach significantly improves performance on the end-to-end forecasting benchmark. Consequently, our solution ranks first in the Argoverse 2 End-to-End Forecasting Challenge held at the CVPR 2024 Workshop on Autonomous Driving (WAD), with 63.82 mAPf. In forecasting, we surpass last year's winner by +17.1 points and this year's runner-up by +13.3 points. We attribute this performance to our modular paradigm, which integrates finetuning strategies and significantly outperforms end-to-end-trained counterparts.


Authors: Yihong Xu, Éloi Zablocki, Alexandre Boulch, Gilles Puy, Mickael Chen, Florent Bartoccioni, Nermin Samet, Oriane Siméoni, Spyros Gidaris, Tuan-Hung Vu, Andrei Bursuc, Eduardo Valle, Renaud Marlet, Matthieu Cord
Published: 2024-06-12 11:50:51+00:00


Categories: cs.CV, cs.RO