Spherical Transformer for LiDAR-based 3D Recognition

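Summary

LiDAR-based 3D point cloud recognition benefits a wide range of applications.
Most current methods do not specifically account for the distribution of LiDAR points, so they suffer from information disconnection and a limited receptive field, especially for sparse distant points.
This work studies the varying-sparsity distribution of LiDAR points and presents SphereFormer, which directly aggregates information from dense close points to sparse distant ones.
The authors design radial window self-attention, which partitions the space into multiple non-overlapping narrow and long windows.
This overcomes the disconnection issue and enlarges the receptive field smoothly and dramatically, which significantly boosts performance on sparse distant points.
In addition, to fit the narrow and long windows, they propose exponential splitting, which yields fine-grained position encoding, and dynamic feature selection, which increases model representation ability.
Notably, the method ranks 1st on both the nuScenes and SemanticKITTI semantic segmentation benchmarks, with 81.9% and 74.8% mIoU, respectively.
It also takes 3rd place on the nuScenes object detection benchmark, with 72.8% NDS and 68.5% mAP.
Code is available at https://github.com/dvlab-research/SphereFormer.git.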
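
As a rough illustration of the radial window idea, the sketch below assigns each point to a window using only its direction from the sensor, so a near point and a far point along the same ray share a window. This is a minimal sketch, not the authors' implementation: the function name `radial_window_index` and the bin counts `n_azimuth` and `n_polar` are hypothetical, and the abstract does not state how the windows are actually parameterized.

```python
import math


def radial_window_index(x, y, z, n_azimuth=8, n_polar=4):
    """Map a point to a window chosen by direction only (hypothetical helper).

    The radial distance is deliberately ignored, so every window is a
    narrow, long wedge that stretches from the sensor outward.
    """
    r = math.sqrt(x * x + y * y + z * z)
    # Azimuth angle in [0, 2*pi).
    azimuth = math.atan2(y, x) % (2.0 * math.pi)
    # Polar angle in [0, pi]; clamp the ratio to guard against rounding.
    polar = math.acos(max(-1.0, min(1.0, z / r))) if r > 0 else 0.0
    # Uniform angular bins (an assumption); clamp the upper edge into range.
    a_bin = min(int(azimuth / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    p_bin = min(int(polar / math.pi * n_polar), n_polar - 1)
    return p_bin * n_azimuth + a_bin


near = radial_window_index(2.0, 1.0, 0.1)
far = radial_window_index(40.0, 20.0, 2.0)   # same direction, 20x farther
other = radial_window_index(-2.0, 1.0, 0.1)  # different direction
print(near == far, near == other)
```

Self-attention restricted to points with the same window index would then let sparse distant points attend to the dense close points in the same wedge, which is the behavior the summary describes.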

Abstract (original)

LiDAR-based 3D point cloud recognition has benefited various applications. Without specially considering the LiDAR point distribution, most current methods suffer from information disconnection and limited receptive field, especially for the sparse distant points. In this work, we study the varying-sparsity distribution of LiDAR points and present SphereFormer to directly aggregate information from dense close points to the sparse distant ones. We design radial window self-attention that partitions the space into multiple non-overlapping narrow and long windows. It overcomes the disconnection issue and enlarges the receptive field smoothly and dramatically, which significantly boosts the performance of sparse distant points. Moreover, to fit the narrow and long windows, we propose exponential splitting to yield fine-grained position encoding and dynamic feature selection to increase model representation ability. Notably, our method ranks 1st on both nuScenes and SemanticKITTI semantic segmentation benchmarks with 81.9% and 74.8% mIoU, respectively. Also, we achieve the 3rd place on nuScenes object detection benchmark with 72.8% NDS and 68.5% mAP. Code is available at https://github.com/dvlab-research/SphereFormer.git.
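
One way to picture exponential splitting is a set of distance bins whose widths grow geometrically, so short distances get fine-grained bins while a long window is still covered by a small number of bins. The sketch below is an assumption-laden illustration: the abstract gives no formula, and the function name `exponential_bin`, the doubling factor, and the parameters `base` and `n_bins` are all hypothetical.

```python
import math


def exponential_bin(distance, base=1.0, n_bins=6):
    """Return a bin index for a non-negative distance (hypothetical helper).

    Bin i covers [base * (2**i - 1), base * (2**(i + 1) - 1)), so each bin
    is twice as wide as the previous one. Distances beyond the last edge
    fall into the final bin.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    idx = int(math.log2(distance / base + 1.0))
    return min(idx, n_bins - 1)


for d in (0.5, 2.0, 5.0, 1000.0):
    print(d, exponential_bin(d))
```

Such an index could be used to look up a learned position-encoding entry along the radial direction; whether SphereFormer does exactly this is not stated in the abstract.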

arXiv information

Authors	Xin Lai, Yukang Chen, Fanbin Lu, Jianhui Liu, Jiaya Jia
Published	2023-03-22 17:30:14+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
