TLCFuse: Temporal Multi-Modality Fusion Towards Occlusion-Aware Semantic Segmentation-Aided Motion Planning

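Summary

In autonomous driving, handling occlusion scenarios is crucial yet challenging.
Robust perception of the surroundings is essential for handling occlusions and supporting motion planning.
State-of-the-art models fuse LiDAR and camera data to produce impressive perception results, but detecting occluded objects remains difficult.
In this paper, we emphasize the crucial role of temporal cues by integrating them alongside these modalities to address this challenge.
We propose a novel approach to bird's-eye-view semantic grid segmentation that leverages sequential sensor data to achieve robustness against occlusions.
Our model extracts information from the sensor readings using attention operations and aggregates it into a lower-dimensional latent representation, which makes it possible to process multi-step inputs at each prediction step.
Moreover, we show how the model can be applied directly to forecasting how traffic scenes develop and integrated seamlessly into a motion planner for trajectory planning.
On the semantic segmentation task, we evaluate the model on the nuScenes dataset and show that it outperforms the baselines, with particularly large margins on occluded and partially occluded vehicles.
On the motion planning task, we are among the early teams to train and evaluate on nuPlan, a state-of-the-art large-scale dataset for motion planning.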
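
The latent aggregation mentioned in the summary can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Perceiver-style cross-attention in which a fixed number of latent vectors attend over all sensor tokens, and it omits the learned query, key, and value projections a real model would have. The names `cross_attend` and `fuse_sequence`, the random latent initialisation, and the token layout (one list of feature vectors per time step) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(latents, tokens):
    """Each latent vector (query) attends over all tokens (used as keys and values)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    updated = []
    for query in latents:
        # Scaled dot-product score between this latent and every token.
        scores = [scale * sum(q * k for q, k in zip(query, tok)) for tok in tokens]
        weights = softmax(scores)
        # Weighted average of the tokens: one output vector per latent.
        updated.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return updated


def fuse_sequence(frames, num_latents=4, seed=0):
    """Pool tokens from every time step into a fixed-size latent array.

    frames: list of time steps, each a list of equal-length feature vectors
    (assumed here to be already-encoded LiDAR and camera features).
    """
    rng = random.Random(seed)
    dim = len(frames[0][0])
    # Assumption: latents start as random vectors; a trained model would learn them.
    latents = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_latents)]
    tokens = [tok for frame in frames for tok in frame]
    return cross_attend(latents, tokens)


if __name__ == "__main__":
    rng = random.Random(1)
    # Three time steps, 50 tokens each, 8 features per token.
    frames = [[[rng.random() for _ in range(8)] for _ in range(50)] for _ in range(3)]
    latent = fuse_sequence(frames)
    print(len(latent), len(latent[0]))  # 4 8
```

The point of the sketch is that the output size depends only on `num_latents` and the feature dimension, not on how many time steps or tokens are fed in, which is one plausible reading of how a compact latent representation lets a model consume multi-step inputs at each prediction step.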

Abstract (original)

In autonomous driving, addressing occlusion scenarios is crucial yet challenging. Robust surrounding perception is essential for handling occlusions and aiding motion planning. State-of-the-art models fuse Lidar and Camera data to produce impressive perception results, but detecting occluded objects remains challenging. In this paper, we emphasize the crucial role of temporal cues by integrating them alongside these modalities to address this challenge. We propose a novel approach for bird’s eye view semantic grid segmentation, that leverages sequential sensor data to achieve robustness against occlusions. Our model extracts information from the sensor readings using attention operations and aggregates this information into a lower-dimensional latent representation, enabling thus the processing of multi-step inputs at each prediction step. Moreover, we show how it can also be directly applied to forecast the development of traffic scenes and be seamlessly integrated into a motion planner for trajectory planning. On the semantic segmentation tasks, we evaluate our model on the nuScenes dataset and show that it outperforms other baselines, with particularly large differences when evaluating on occluded and partially-occluded vehicles. Additionally, on motion planning task we are among the early teams to train and evaluate on nuPlan, a cutting-edge large-scale dataset for motion planning.

arXiv information

Authors	Gustavo Salazar-Gomez, Wenqian Liu, Manuel Diaz-Zapata, David Sierra-Gonzalez, Christian Laugier
Published	2024-11-25 18:59:25+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
