4D Contrastive Superflows are Dense 3D Representation Learners

Summary

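Accurate 3D perception is fundamental to autonomous driving.
However, developing such models depends on extensive human annotation, which is both costly and labor-intensive.
To address this challenge from a data representation learning perspective, we introduce SuperFlow, a new framework that uses consecutive LiDAR-camera pairs to establish spatiotemporal pretraining objectives.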
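SuperFlow integrates two key designs: 1) a dense-to-sparse consistency regularization, which promotes insensitivity to variations in point cloud density during feature learning, and 2) a flow-based contrastive learning module, designed to extract meaningful temporal cues from readily available sensor calibrations.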
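To further improve learning efficiency, we incorporate a plug-and-play view consistency module that improves the alignment of the knowledge distilled from camera views.
Extensive comparative and ablation studies across 11 heterogeneous LiDAR datasets validate the effectiveness and superiority of our approach.
In addition, scaling up the 2D and 3D backbones during pretraining reveals several interesting emergent properties, shedding light on future research into 3D foundation models for LiDAR-based perception.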
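
The abstract names a contrastive objective over temporally adjacent frames but does not give its exact form. As a rough illustration only, the sketch below uses a generic InfoNCE loss, which is a common choice for contrastive learning and is an assumption here, not the authors' stated formulation. The function name `info_nce`, the temperature value, and the random arrays standing in for per-segment features at two consecutive timesteps are all hypothetical.

```python
import numpy as np

def info_nce(anchor, positive, temperature=0.07):
    """Generic InfoNCE loss (assumed form, not taken from the paper).

    Row i of `anchor` is treated as matching row i of `positive`;
    every other row of `positive` serves as a negative.
    """
    # L2-normalize so that dot products are cosine similarities.
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability of the matching (diagonal) pairs.
    return -np.mean(np.diag(log_prob))

# Hypothetical stand-ins for segment features at timesteps t and t+1.
rng = np.random.default_rng(0)
feats_t = rng.normal(size=(8, 16))
feats_t1_aligned = feats_t + 0.01 * rng.normal(size=(8, 16))
feats_t1_shuffled = feats_t1_aligned[::-1]  # correspondence deliberately broken

loss_aligned = info_nce(feats_t, feats_t1_aligned)
loss_shuffled = info_nce(feats_t, feats_t1_shuffled)
print(loss_aligned < loss_shuffled)
```

When the features at the two timesteps correspond row by row, the loss is low; when the correspondence is broken, it rises. That is the property a temporal contrastive objective relies on, whatever its exact form.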

Abstract (original)

In the realm of autonomous driving, accurate 3D perception is the foundation. However, developing such models relies on extensive human annotations — a process that is both costly and labor-intensive. To address this challenge from a data representation learning perspective, we introduce SuperFlow, a novel framework designed to harness consecutive LiDAR-camera pairs for establishing spatiotemporal pretraining objectives. SuperFlow stands out by integrating two key designs: 1) a dense-to-sparse consistency regularization, which promotes insensitivity to point cloud density variations during feature learning, and 2) a flow-based contrastive learning module, carefully crafted to extract meaningful temporal cues from readily available sensor calibrations. To further boost learning efficiency, we incorporate a plug-and-play view consistency module that enhances the alignment of the knowledge distilled from camera views. Extensive comparative and ablation studies across 11 heterogeneous LiDAR datasets validate our effectiveness and superiority. Additionally, we observe several interesting emerging properties by scaling up the 2D and 3D backbones during pretraining, shedding light on the future research of 3D foundation models for LiDAR-based perception.

arXiv information

Authors	Xiang Xu, Lingdong Kong, Hui Shuai, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Qingshan Liu
Published	2024-07-08 17:59:54+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
