SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection

Summary

Title: SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection

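Summary:
– By identifying four key components of existing LiDAR-camera 3D object detection methods (LiDAR candidates, camera candidates, transformation, and fusion outputs), the authors observe that all existing methods either find dense candidates or produce dense representations of the scene.
– However, because objects occupy only a small part of a scene, finding dense candidates and generating dense representations is noisy and inefficient.
– SparseFusion is a new multi-sensor 3D detection method that uses sparse candidates and sparse representations exclusively.
– Specifically, SparseFusion uses the outputs of parallel detectors in the LiDAR and camera modalities as sparse candidates for fusion.
– The camera candidates are transformed into the LiDAR coordinate space by disentangling the object representations; the multi-modality candidates are then fused in a unified 3D space by a lightweight self-attention module.
– To mitigate negative transfer between modalities, the authors propose novel semantic and geometric cross-modality transfer modules that are applied before the modality-specific detectors.
– SparseFusion achieves state-of-the-art performance on the nuScenes benchmark while also running at the fastest speed, even outperforming methods with stronger backbones.
– Extensive experiments demonstrate the effectiveness and efficiency of the individual modules and of the overall pipeline.
– The SparseFusion code will be made publicly available at https://github.com/yichen928/SparseFusion.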
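The fusion step summarized above (sparse candidates from two parallel detectors, combined by self-attention in one shared space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `self_attention` and `fuse_candidates`, the 4-dimensional feature vectors, and the use of identity query/key/value projections in place of learned ones are all assumptions made purely for illustration. It also assumes the camera candidates have already been mapped into the LiDAR coordinate space.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity Q/K/V projections (a real module would learn them).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this candidate to every candidate, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Weighted average of all candidate features.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def fuse_candidates(lidar_candidates, camera_candidates):
    """Concatenate the two sparse candidate sets and let them attend to each other."""
    tokens = lidar_candidates + camera_candidates
    return self_attention(tokens)


# Hypothetical candidate features: two from the LiDAR branch, one from the camera branch.
lidar = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9]]
camera = [[0.9, 0.1, 0.4, 0.3]]
fused = fuse_candidates(lidar, camera)
print(len(fused), len(fused[0]))
```

The point of the sketch is the cost profile: attention runs over a handful of object candidates rather than over a dense grid of scene features, so the number of pairwise interactions grows with the number of candidates, not with the scene resolution.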

Abstract (original)

By identifying four important components of existing LiDAR-camera 3D object detection methods (LiDAR and camera candidates, transformation, and fusion outputs), we observe that all existing methods either find dense candidates or yield dense representations of scenes. However, given that objects occupy only a small part of a scene, finding dense candidates and generating dense representations is noisy and inefficient. We propose SparseFusion, a novel multi-sensor 3D detection method that exclusively uses sparse candidates and sparse representations. Specifically, SparseFusion utilizes the outputs of parallel detectors in the LiDAR and camera modalities as sparse candidates for fusion. We transform the camera candidates into the LiDAR coordinate space by disentangling the object representations. Then, we can fuse the multi-modality candidates in a unified 3D space by a lightweight self-attention module. To mitigate negative transfer between modalities, we propose novel semantic and geometric cross-modality transfer modules that are applied prior to the modality-specific detectors. SparseFusion achieves state-of-the-art performance on the nuScenes benchmark while also running at the fastest speed, even outperforming methods with stronger backbones. We perform extensive experiments to demonstrate the effectiveness and efficiency of our modules and overall method pipeline. Our code will be made publicly available at https://github.com/yichen928/SparseFusion.

arXiv information

Authors	Yichen Xie, Chenfeng Xu, Marie-Julie Rakotosaona, Patrick Rim, Federico Tombari, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan
Published	2023-04-27 17:17:39+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
