MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer for Autonomous Driving

Abstract

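3D object detection is a key task for autonomous driving.
With recent progress in vision transformers, 2D object detection is now commonly addressed with a set-to-set loss.
Inspired by these 2D approaches and by DETR3D, an approach to multi-view 3D object detection, we propose MSF3DDETR, a Multi-Sensor Fusion 3D Detection Transformer architecture that fuses image and LiDAR features to improve detection accuracy.
Our end-to-end, single-stage, anchor-free, and NMS-free network takes multi-view images and LiDAR point clouds as input and predicts 3D bounding boxes.
First, a novel MSF3DDETR cross-attention block links object queries learned from data to the image and LiDAR features.
Second, the object queries interact with each other in a multi-head self-attention block.
Finally, the MSF3DDETR block is repeated $L$ times to refine the object queries.
The MSF3DDETR network is trained end-to-end on the nuScenes dataset using Hungarian-algorithm-based bipartite matching and a set-to-set loss inspired by DETR.
We present quantitative and qualitative results that are competitive with state-of-the-art approaches.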
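
The bipartite matching step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical cost matrix in which `cost[i][j]` scores how poorly prediction `i` fits ground-truth box `j`; the abstract does not define the actual cost terms. For brevity the sketch enumerates all assignments by brute force instead of running the Hungarian algorithm, which reaches the same optimum in polynomial time. The names `match_predictions` and `cost` are illustrative and do not come from the paper.

```python
from itertools import permutations


def match_predictions(cost):
    """Return (assignment, total_cost) for the cheapest one-to-one matching.

    cost[i][j] is a hypothetical matching cost between prediction i and
    ground-truth box j. This is a brute-force stand-in for the Hungarian
    algorithm and is only practical for very small inputs.
    """
    n_pred, n_gt = len(cost), len(cost[0])
    assert n_pred >= n_gt, "expects at least as many predictions as boxes"
    best_total, best_perm = float("inf"), None
    # perm[g] is the prediction index assigned to ground-truth box g
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost[p][g] for g, p in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    # Map each matched prediction index to its ground-truth index
    return {p: g for g, p in enumerate(best_perm)}, best_total


# Three predictions (rows), two ground-truth boxes (columns)
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.5, 0.6],
]
assignment, total = match_predictions(cost)
print(assignment, total)
```

In this toy example, prediction 1 is matched to box 0 and prediction 0 to box 1. Prediction 2 stays unmatched; in a DETR-style set-to-set loss such predictions are typically supervised as "no object".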

Abstract (original)

3D object detection is a significant task for autonomous driving. Recently with the progress of vision transformers, the 2D object detection problem is being treated with the set-to-set loss. Inspired by these approaches on 2D object detection and an approach for multi-view 3D object detection DETR3D, we propose MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer architecture to fuse image and LiDAR features to improve the detection accuracy. Our end-to-end single-stage, anchor-free and NMS-free network takes in multi-view images and LiDAR point clouds and predicts 3D bounding boxes. Firstly, we link the object queries learnt from data to the image and LiDAR features using a novel MSF3DDETR cross-attention block. Secondly, the object queries interacts with each other in multi-head self-attention block. Finally, MSF3DDETR block is repeated for $L$ number of times to refine the object queries. The MSF3DDETR network is trained end-to-end on the nuScenes dataset using Hungarian algorithm based bipartite matching and set-to-set loss inspired by DETR. We present both quantitative and qualitative results which are competitive to the state-of-the-art approaches.
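
The three-step query refinement described in the abstract (cross-attention to both sensors, self-attention among queries, repetition for $L$ layers) can be sketched roughly as follows. This is not the authors' block: the abstract does not describe the internals of the MSF3DDETR cross-attention, so the sketch substitutes plain single-head scaled dot-product attention, fuses the two modalities by summation (an assumption), and omits learned projections, normalization, and feed-forward layers. All function names and toy feature values are hypothetical.

```python
import math


def attention(queries, keys, values):
    """Single-head scaled dot-product attention on lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([
            sum(w / z * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ])
    return out


def add(a, b):
    """Element-wise sum of two lists of vectors (residual connection)."""
    return [[x + y for x, y in zip(u, v)] for u, v in zip(a, b)]


def refine_queries(queries, image_feats, lidar_feats, num_layers):
    """Refine object queries over num_layers layers (L in the abstract)."""
    for _ in range(num_layers):
        # Step 1: link queries to image and LiDAR features.
        # Fusing the two results by summation is an assumption.
        fused = add(attention(queries, image_feats, image_feats),
                    attention(queries, lidar_feats, lidar_feats))
        queries = add(queries, fused)
        # Step 2: let the queries interact with each other.
        queries = add(queries, attention(queries, queries, queries))
    return queries


# Toy inputs: two object queries, three image tokens, two LiDAR tokens
queries = [[0.1, 0.0], [0.0, 0.1]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
lidar_feats = [[0.2, 0.8], [0.9, 0.1]]
refined = refine_queries(queries, image_feats, lidar_feats, num_layers=2)
print(refined)
```

In a full detector, each refined query would then be decoded into a class score and 3D box parameters; that prediction head is left out here.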

arXiv information

Authors	Gopi Krishna Erabati, Helder Araujo
Published	2022-10-27 10:55:15+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
