Weakly Supervised Point Clouds Transformer for 3D Object Detection

Summary

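Annotated 3D datasets are required for semantic segmentation and object detection in scene understanding.
This paper presents a framework for weakly supervised training of a point cloud transformer used for 3D object detection.
The aim is to reduce the amount of supervision needed for training, since annotating 3D datasets is costly.
We propose an Unsupervised Voting Proposal Module, which learns randomly preset anchor points and uses a voting network to select high-quality anchor points.
It then distills information into the student and teacher networks.
For the student network, we apply a ResNet to extract local features efficiently.
However, this can also lose much of the global information.
To give the student network an input that incorporates both global and local information, we adopt the transformer's self-attention mechanism to extract global features and use ResNet layers to extract region proposals.
The teacher network supervises the classification and regression of the student network using a model pre-trained on ImageNet.
On the challenging KITTI dataset, the experimental results achieve the highest average precision compared with the most recent weakly supervised 3D object detectors.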
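
The abstract does not specify how self-attention is applied to the point features, so the following is only a minimal sketch of standard scaled dot-product self-attention, not the authors' implementation. It assumes identity query/key/value projections (a real transformer layer learns these), and the names `self_attention` and `points` are hypothetical. It illustrates why attention captures global context: every output vector is a weighted mix of all input points.

```python
import math


def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over N per-point feature vectors.

    Assumption: queries, keys, and values are the raw features themselves
    (identity projections), which keeps the sketch short.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    out = []
    for q in features:
        # Similarity of this point to every point in the cloud.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in features]
        weights = softmax(scores)
        # Each output is a convex combination of all point features.
        out.append([sum(w * v[j] for w, v in zip(weights, features)) for j in range(d)])
    return out


# Toy input: three points with 2-dimensional features (made-up values).
points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_features = self_attention(points)
```

In this sketch the output has the same shape as the input, so it could be combined with locally extracted features before being passed on; how the paper actually fuses the two is not stated in the abstract.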

Summary (original)

The annotation of 3D datasets is required for semantic-segmentation and object detection in scene understanding. In this paper we present a framework for the weakly supervision of a point clouds transformer that is used for 3D object detection. The aim is to decrease the required amount of supervision needed for training, as a result of the high cost of annotating a 3D datasets. We propose an Unsupervised Voting Proposal Module, which learns randomly preset anchor points and uses voting network to select prepared anchor points of high quality. Then it distills information into student and teacher network. In terms of student network, we apply ResNet network to efficiently extract local characteristics. However, it also can lose much global information. To provide the input which incorporates the global and local information as the input of student networks, we adopt the self-attention mechanism of transformer to extract global features, and the ResNet layers to extract region proposals. The teacher network supervises the classification and regression of the student network using the pre-trained model on ImageNet. On the challenging KITTI datasets, the experimental results have achieved the highest level of average precision compared with the most recent weakly supervised 3D object detectors.

arXiv information

Authors	Zuojin Tang, Bo Sun, Tongwei Ma, Daosheng Li, Zhenhui Xu
Published	2023-09-08 03:56:34+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
