Finetuning Pre-trained Model with Limited Data for LiDAR-based 3D Object Detection by Bridging Domain Gaps

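Abstract

LiDAR-based 3D object detectors are widely used in applications such as autonomous vehicles and mobile robots.
However, these detectors often fail to adapt well to target domains that differ in sensor configuration (e.g., sensor type, spatial resolution, or FOV) or in location.
Closing such gaps generally requires collecting and annotating a dataset in the new setup, which is often expensive and time-consuming.
Recent studies suggest that pre-trained backbones can be learned in a self-supervised manner from large-scale unlabeled LiDAR frames.
Despite their expressive representations, however, these backbones still struggle to generalize well without a substantial amount of data from the target domain.
We therefore propose a new method, Domain Adaptive Distill-Tuning (DADT), which adapts a pre-trained model using limited target data (approximately 100 LiDAR frames) while retaining its representation power and preventing overfitting.
Specifically, we use regularizers to align object-level and context-level representations between the pre-trained and finetuned models in a teacher-student architecture.
Experiments on driving benchmarks, namely the Waymo Open dataset and KITTI, confirm that our method effectively finetunes a pre-trained model and achieves significant gains in accuracy.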

Abstract (original)

LiDAR-based 3D object detectors have been largely utilized in various applications, including autonomous vehicles or mobile robots. However, LiDAR-based detectors often fail to adapt well to target domains with different sensor configurations (e.g., types of sensors, spatial resolution, or FOVs) and location shifts. Collecting and annotating datasets in a new setup is commonly required to reduce such gaps, but it is often expensive and time-consuming. Recent studies suggest that pre-trained backbones can be learned in a self-supervised manner with large-scale unlabeled LiDAR frames. However, despite their expressive representations, they remain challenging to generalize well without substantial amounts of data from the target domain. Thus, we propose a novel method, called Domain Adaptive Distill-Tuning (DADT), to adapt a pre-trained model with limited target data (approximately 100 LiDAR frames), retaining its representation power and preventing it from overfitting. Specifically, we use regularizers to align object-level and context-level representations between the pre-trained and finetuned models in a teacher-student architecture. Our experiments with driving benchmarks, i.e., Waymo Open dataset and KITTI, confirm that our method effectively finetunes a pre-trained model, achieving significant gains in accuracy.
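
The abstract does not give the form of the regularizers, so the following is only a rough sketch of the general idea: a frozen pre-trained teacher and a finetuned student produce features for the same frame, and a penalty on their disagreement is added to the detection loss. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the use of a squared L2 distance, the split of feature cells into "object" and "context" by a ground-truth box mask, and the weights `lambda_obj` and `lambda_ctx`.

```python
from typing import Sequence

Vector = Sequence[float]


def mean_squared_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Mean squared L2 distance between paired feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature lists must have the same length")
    if not a:
        return 0.0  # nothing to align (e.g. a frame with no objects)
    total = 0.0
    for u, v in zip(a, b):
        total += sum((x - y) ** 2 for x, y in zip(u, v))
    return total / len(a)


def distill_tuning_loss(
    detection_loss: float,
    teacher_feats: Sequence[Vector],
    student_feats: Sequence[Vector],
    object_mask: Sequence[bool],
    lambda_obj: float = 1.0,
    lambda_ctx: float = 1.0,
) -> float:
    """Hypothetical total loss: detection loss plus two alignment penalties.

    teacher_feats / student_feats: one feature vector per cell of the same frame.
    object_mask: True where a cell falls inside a labeled box (assumed split).
    """
    # Object-level term: cells covered by labeled objects.
    obj_t = [f for f, m in zip(teacher_feats, object_mask) if m]
    obj_s = [f for f, m in zip(student_feats, object_mask) if m]
    # Context-level term: all remaining cells.
    ctx_t = [f for f, m in zip(teacher_feats, object_mask) if not m]
    ctx_s = [f for f, m in zip(student_feats, object_mask) if not m]

    obj_loss = mean_squared_distance(obj_t, obj_s)
    ctx_loss = mean_squared_distance(ctx_t, ctx_s)
    return detection_loss + lambda_obj * obj_loss + lambda_ctx * ctx_loss


# Toy example: three cells with 2-D features; the first two lie inside a box.
teacher = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
student = [[1.0, 0.0], [0.0, 0.0], [0.5, 0.5]]
mask = [True, True, False]
total = distill_tuning_loss(2.0, teacher, student, mask, lambda_obj=0.1)
```

In this toy case only one object cell differs between teacher and student, so the penalty is small and the total stays close to the detection loss. In a real training loop the teacher's features would be computed without gradients, so that only the student is updated.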

arXiv information

Authors	Jiyun Jang, Mincheol Chang, Jongwon Park, Jinkyu Kim
Published	2024-10-02 08:22:42+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google