OC3D: Weakly Supervised Outdoor 3D Object Detection with Only Coarse Click Annotation

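Summary

LiDAR-based outdoor 3D object detection has attracted wide attention.
However, training 3D detectors on LiDAR point clouds typically depends on expensive bounding-box annotations.
This paper presents OC3D, a weakly supervised method that requires only coarse clicks on the bird's-eye view of the 3D point cloud.
The key challenge is that such simple click annotations carry no complete geometric description of the target objects.
To address this, OC3D adopts a two-stage strategy.
In the first stage, the authors design a dynamic/static classification strategy, then propose the Click2Box and Click2Mask modules, which generate box-level pseudo-labels for static instances and mask-level pseudo-labels for dynamic instances.
In the second stage, a Mask2Box module uses the learning capability of neural networks to upgrade the mask-level pseudo-labels, which contain less information, to box-level pseudo-labels.
Experiments on the widely used KITTI and nuScenes datasets show that OC3D, using only coarse clicks, achieves state-of-the-art performance among weakly supervised 3D detection methods.
Combining OC3D with a missing-click mining strategy yields the OC3D++ pipeline, which needs only 0.2% of the annotation cost on KITTI to reach performance comparable to fully supervised methods.
The code will be made publicly available.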
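
The abstract does not say how the dynamic/static classification works, so the following is only a minimal sketch of one plausible reading: a clicked location counts as static if LiDAR points persist near it across most frames, and as dynamic otherwise. The function names, the `radius` and `static_ratio` parameters, and the assumption that all frames are already expressed in a common world coordinate frame are hypothetical and not taken from the paper. The final routing step mirrors the abstract: static instances receive box-level pseudo-labels (Click2Box), dynamic instances receive mask-level pseudo-labels (Click2Mask).

```python
from math import hypot


def frames_with_points_near(click_xy, frames, radius=1.0):
    """Count frames holding at least one BEV point within `radius` of the click.

    `frames` is a list of frames; each frame is a list of (x, y) points that
    are assumed to be ego-motion compensated into one world coordinate frame.
    """
    cx, cy = click_xy
    return sum(
        any(hypot(x - cx, y - cy) <= radius for x, y in points)
        for points in frames
    )


def classify_click(click_xy, frames, radius=1.0, static_ratio=0.8):
    """Label a click 'static' if points persist near it in most frames.

    The persistence heuristic and the 0.8 threshold are illustrative
    assumptions, not the criterion used by OC3D.
    """
    if not frames:
        raise ValueError("need at least one frame")
    ratio = frames_with_points_near(click_xy, frames, radius) / len(frames)
    return "static" if ratio >= static_ratio else "dynamic"


def pseudo_label_type(click_xy, frames):
    """Pick the stage-one pseudo-label type described in the abstract."""
    kind = classify_click(click_xy, frames)
    # Static -> box-level label (Click2Box); dynamic -> mask-level (Click2Mask).
    return "box" if kind == "static" else "mask"


if __name__ == "__main__":
    # A parked object: points stay near (10, 5) in every frame.
    parked = [[(10.0, 5.0), (10.3, 5.2)] for _ in range(5)]
    # A moving object: it advances 3 m along x in each frame.
    moving = [[(10.0 + 3.0 * t, 5.0)] for t in range(5)]
    print(pseudo_label_type((10.0, 5.0), parked))
    print(pseudo_label_type((10.0, 5.0), moving))
```

In this sketch the dynamic branch stops at a label type; in the paper's second stage, the Mask2Box module is what turns such mask-level labels into boxes.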

Abstract (original)

LiDAR-based outdoor 3D object detection has received widespread attention. However, training 3D detectors from the LiDAR point cloud typically relies on expensive bounding box annotations. This paper presents OC3D, an innovative weakly supervised method requiring only coarse clicks on the bird’s eye view of the 3D point cloud. A key challenge here is the absence of complete geometric descriptions of the target objects from such simple click annotations. To address this problem, our proposed OC3D adopts a two-stage strategy. In the first stage, we initially design a novel dynamic and static classification strategy and then propose the Click2Box and Click2Mask modules to generate box-level and mask-level pseudo-labels for static and dynamic instances, respectively. In the second stage, we design a Mask2Box module, leveraging the learning capabilities of neural networks to update mask-level pseudo-labels, which contain less information, to box-level pseudo-labels. Experimental results on the widely used KITTI and nuScenes datasets demonstrate that our OC3D with only coarse clicks achieves state-of-the-art performance compared to weakly-supervised 3D detection methods. Combining OC3D with a missing click mining strategy, we propose an OC3D++ pipeline, which requires only 0.2% annotation cost in the KITTI dataset to achieve performance comparable to fully supervised methods. The code will be made publicly available.

arXiv information

Authors	Qiming Xia, Hongwei Lin, Wei Ye, Hai Wu, Yadan Luo, Shijia Zhao, Xin Li, Chenglu Wen
Published	2024-08-16 02:28:12+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
