DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction

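Summary

Multi-sensor fusion significantly improves the accuracy and robustness of 3D semantic occupancy prediction, which is crucial for autonomous driving and robotics.
However, most existing approaches rely on large image resolutions and complex networks to reach top performance, which hinders their use in practical scenarios.
In addition, most multi-sensor fusion approaches concentrate on improving the fused features and neglect supervision strategies for those features.
To address this, we propose DAOcc, a novel multi-modal occupancy prediction framework that uses 3D object detection supervision to help achieve superior performance while relying on a deployment-friendly image feature extraction network and a practical input image resolution.
We also introduce a BEV View Range Extension strategy to mitigate the adverse effects of the reduced image resolution.
Experimental results show that DAOcc achieves new state-of-the-art performance on the Occ3D-nuScenes and SurroundOcc benchmarks, surpassing other methods by a significant margin while using only ResNet50 and a 256*704 input image resolution.
Code will be made available at https://github.com/AlphaPlusTT/DAOcc.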

Summary (original)

Multi-sensor fusion significantly enhances the accuracy and robustness of 3D semantic occupancy prediction, which is crucial for autonomous driving and robotics. However, most existing approaches depend on large image resolutions and complex networks to achieve top performance, hindering their application in practical scenarios. Additionally, most multi-sensor fusion approaches focus on improving fusion features while overlooking the exploration of supervision strategies for these features. To this end, we propose DAOcc, a novel multi-modal occupancy prediction framework that leverages 3D object detection supervision to assist in achieving superior performance, while using a deployment-friendly image feature extraction network and practical input image resolution. Furthermore, we introduce a BEV View Range Extension strategy to mitigate the adverse effects of reduced image resolution. Experimental results show that DAOcc achieves new state-of-the-art performance on the Occ3D-nuScenes and SurroundOcc benchmarks, and surpasses other methods by a significant margin while using only ResNet50 and 256*704 input image resolution. Code will be made available at https://github.com/AlphaPlusTT/DAOcc.
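
The idea of using 3D object detection as an additional supervision signal for the fused features can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the loss functions, so the per-voxel cross-entropy, the L1 box loss, the function names, and the `det_weight` value are hypothetical and are not taken from the DAOcc code.

```python
import math


def occupancy_loss(voxel_probs, voxel_labels):
    """Mean per-voxel cross-entropy (assumed form of the occupancy loss).

    voxel_probs: list of per-voxel class probability lists.
    voxel_labels: list of ground-truth class indices, one per voxel.
    """
    losses = [-math.log(probs[label])
              for probs, label in zip(voxel_probs, voxel_labels)]
    return sum(losses) / len(losses)


def detection_loss(pred_boxes, gt_boxes):
    """Mean L1 error over box parameters (assumed form of the detection loss).

    Each box is a flat list of numbers, e.g. [x, y, z, w, l, h].
    """
    total = 0.0
    count = 0
    for pred, gt in zip(pred_boxes, gt_boxes):
        for p, g in zip(pred, gt):
            total += abs(p - g)
            count += 1
    return total / count


def total_loss(occ, det, det_weight=0.5):
    """Occupancy loss plus a weighted auxiliary detection term.

    det_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return occ + det_weight * det
```

In a setup like this, the detection term only shapes training; at inference time an auxiliary head of this kind could be dropped, leaving the occupancy branch unchanged.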

arXiv information

Authors	Zhen Yang,Yanpeng Dong,Heng Wang,Lichao Ma,Zijian Cui,Qi Liu,Haoran Pei
Published	2024-11-20 12:54:39+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
