Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning

Abstract

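This paper studies point cloud perception in outdoor environments.
Because outdoor point clouds are sparse, existing methods struggle to recognize distant or occluded objects.
We observe that accumulating multiple temporally consecutive LiDAR sweeps substantially mitigates this problem and markedly improves perception accuracy.
However, the computational cost grows as well, which has kept previous approaches from using a large number of LiDAR sweeps.
To address this, we find that a considerable portion of the points in the accumulated point cloud is redundant, and that discarding them has minimal impact on perception accuracy.
We introduce a simple yet effective Gumbel Spatial Pruning (GSP) layer that dynamically prunes points based on sampling learned end to end.
The GSP layer is decoupled from other network components, so it can be integrated seamlessly into existing point cloud network architectures.
Without incurring additional computational overhead, we increase the number of LiDAR sweeps from the commonly used 10 to as many as 40, which significantly improves perception performance.
For instance, on the nuScenes 3D object detection and BEV map segmentation tasks, our pruning strategy improves the vanilla TransL baseline and other baseline methods.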
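
The abstract does not spell out how the GSP layer samples points, so the following is only a minimal sketch of the generic Gumbel-max / Gumbel-softmax trick for a per-point keep-or-drop decision, not the authors' implementation. The function names (`sample_gumbel`, `gumbel_keep_mask`, `prune_points`), the two-logit parameterization, and the temperature `tau` are all assumptions made for illustration; in a real network the logits would come from a learned scoring head.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_keep_mask(keep_logits, drop_logits, tau=1.0, rng=None):
    """Sample a keep/drop decision per point (hypothetical helper).

    Returns (hard, soft): `hard` is the boolean Gumbel-max decision,
    `soft` is the relaxed Gumbel-softmax probability of "keep".
    In a deep learning framework, a straight-through estimator would
    typically use `hard` in the forward pass and the gradient of `soft`
    in the backward pass.
    """
    if rng is None:
        rng = random.Random()
    hard, soft = [], []
    for k, d in zip(keep_logits, drop_logits):
        # Perturb each logit with independent Gumbel noise.
        zk = (k + sample_gumbel(rng)) / tau
        zd = (d + sample_gumbel(rng)) / tau
        # Numerically stable two-way softmax.
        m = max(zk, zd)
        ek, ed = math.exp(zk - m), math.exp(zd - m)
        soft.append(ek / (ek + ed))
        hard.append(zk >= zd)
    return hard, soft


def prune_points(points, keep_logits, drop_logits, tau=1.0, rng=None):
    """Keep only the points whose sampled decision is "keep"."""
    hard, _ = gumbel_keep_mask(keep_logits, drop_logits, tau, rng)
    return [p for p, keep in zip(points, hard) if keep]
```

Because the decision is sampled rather than thresholded, points with similar logits are kept or dropped stochastically during training, while points with a large logit gap are treated almost deterministically.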

Abstract (original)

This paper studies point cloud perception within outdoor environments. Existing methods face limitations in recognizing objects located at a distance or occluded, due to the sparse nature of outdoor point clouds. In this work, we observe a significant mitigation of this problem by accumulating multiple temporally consecutive LiDAR sweeps, resulting in a remarkable improvement in perception accuracy. However, the computation cost also increases, hindering previous approaches from utilizing a large number of LiDAR sweeps. To tackle this challenge, we find that a considerable portion of points in the accumulated point cloud is redundant, and discarding these points has minimal impact on perception accuracy. We introduce a simple yet effective Gumbel Spatial Pruning (GSP) layer that dynamically prunes points based on a learned end-to-end sampling. The GSP layer is decoupled from other network components and thus can be seamlessly integrated into existing point cloud network architectures. Without incurring additional computational overhead, we increase the number of LiDAR sweeps from 10, a common practice, to as many as 40. Consequently, there is a significant enhancement in perception performance. For instance, in nuScenes 3D object detection and BEV map segmentation tasks, our pruning strategy improves the vanilla TransL baseline and other baseline methods.
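
As a rough illustration of what accumulating temporally consecutive LiDAR sweeps involves, the sketch below merges several sweeps into the coordinate frame of a reference sweep and tags each point with its time lag. It assumes a simplified planar ego pose `(tx, ty, yaw)` and a hypothetical dictionary layout for each sweep (`points`, `pose`, `time`); real datasets use full 3D poses, and nothing here is taken from the paper's code.

```python
import math


def to_world(point, pose):
    """Map a point from a sweep's sensor frame to the world frame."""
    x, y, z = point
    tx, ty, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + tx, s * x + c * y + ty, z)


def to_local(point, pose):
    """Map a world-frame point into the frame described by `pose`."""
    x, y, z = point
    tx, ty, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    dx, dy = x - tx, y - ty
    # Apply the inverse rotation (transpose of the yaw rotation).
    return (c * dx + s * dy, -s * dx + c * dy, z)


def accumulate_sweeps(sweeps, ref_index=-1):
    """Merge all sweeps into the reference sweep's frame.

    Each output point is (x, y, z, dt), where dt is the time lag of
    the source sweep relative to the reference sweep.
    """
    ref = sweeps[ref_index]
    merged = []
    for sweep in sweeps:
        dt = ref["time"] - sweep["time"]
        for p in sweep["points"]:
            x, y, z = to_local(to_world(p, sweep["pose"]), ref["pose"])
            merged.append((x, y, z, dt))
    return merged
```

The merged list grows linearly with the number of sweeps, which is the cost increase the abstract refers to and the reason a pruning step becomes attractive.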

arXiv information

Authors	Jianhao Li, Tianyu Sun, Xueqian Zhang, Zhongdao Wang, Bailan Feng, Hengshuang Zhao
Published	2024-11-12 12:07:27+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
