TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models

Summary

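Test-time training (TTT) adapts a pre-trained network on the fly to changing data distributions.
In this work, we propose TTT-KD, the first TTT method for 3D semantic segmentation, which models knowledge distillation (KD) from foundation models (e.g. DINOv2) as a self-supervised objective for adapting to distribution shifts at test time.
Given access to paired image-point cloud (2D-3D) data, we first optimize a 3D segmentation backbone for the main task of semantic segmentation on the point clouds and for the task of 2D $\to$ 3D KD, using an off-the-shelf 2D pre-trained foundation model.
At test time, TTT-KD updates the 3D segmentation backbone for each test sample using the self-supervised knowledge distillation task before making the final prediction.
Extensive evaluations on multiple indoor and outdoor 3D segmentation benchmarks show the utility of TTT-KD, as it improves performance on both in-distribution (ID) and out-of-distribution (OOD) test datasets.
We achieve a gain of up to 13% mIoU (7% on average) when the train and test distributions are similar, and up to 45% (20% on average) when adapting to OOD test samples.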
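
The test-time step described above can be illustrated with a toy NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "backbone" is a single linear map `W`, the KD head `A` and segmentation head `S` are frozen linear maps, the teacher features `T` stand in for 2D foundation-model features already associated with each point, the KD objective is a plain mean squared error, and each sample starts from a fresh copy of the trained weights. The actual method uses a deep 3D backbone and may differ in all of these choices.

```python
import numpy as np

def kd_loss_and_grad(W, A, X, T):
    """MSE between distilled point features and teacher features, plus dL/dW."""
    P = X @ W @ A                 # per-point prediction of the teacher features
    diff = P - T
    loss = float(np.mean(diff ** 2))
    grad_P = 2.0 * diff / diff.size
    grad_W = X.T @ grad_P @ A.T   # chain rule through the frozen KD head
    return loss, grad_W

def ttt_predict(W_trained, A, S, X, T, steps=20, lr=0.01):
    """Adapt a copy of the backbone on one test sample, then segment it."""
    W = W_trained.copy()          # assumption: restart from trained weights per sample
    for _ in range(steps):
        _, grad_W = kd_loss_and_grad(W, A, X, T)
        W -= lr * grad_W          # only the backbone is updated; heads stay frozen
    logits = X @ W @ S            # main-task prediction with the adapted backbone
    return logits.argmax(axis=1), W

# Hypothetical toy data: 50 points, 4-dim features, 5-dim teacher, 6 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))      # point coordinates of one test sample
W0 = rng.normal(size=(3, 4))      # "trained" backbone weights
A = rng.normal(size=(4, 5))       # frozen KD head
S = rng.normal(size=(4, 6))       # frozen segmentation head
T = rng.normal(size=(50, 5))      # teacher features for the same points

loss_before, _ = kd_loss_and_grad(W0, A, X, T)
labels, W_adapted = ttt_predict(W0, A, S, X, T)
loss_after, _ = kd_loss_and_grad(W_adapted, A, X, T)
```

No labels are used during adaptation: the only supervision at test time is the teacher features, which is what makes the distillation task usable on unlabeled test samples.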

Abstract (original)

Test-Time Training (TTT) proposes to adapt a pre-trained network to changing data distributions on-the-fly. In this work, we propose the first TTT method for 3D semantic segmentation, TTT-KD, which models Knowledge Distillation (KD) from foundation models (e.g. DINOv2) as a self-supervised objective for adaptation to distribution shifts at test-time. Given access to paired image-pointcloud (2D-3D) data, we first optimize a 3D segmentation backbone for the main task of semantic segmentation using the pointclouds and the task of 2D $\to$ 3D KD by using an off-the-shelf 2D pre-trained foundation model. At test-time, our TTT-KD updates the 3D segmentation backbone for each test sample, by using the self-supervised task of knowledge distillation, before performing the final prediction. Extensive evaluations on multiple indoor and outdoor 3D segmentation benchmarks show the utility of TTT-KD, as it improves performance for both in-distribution (ID) and out-of-distribution (OOD) test datasets. We achieve a gain of up to 13% mIoU (7% on average) when the train and test distributions are similar and up to 45% (20% on average) when adapting to OOD test samples.
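
The gains above are reported in mIoU (mean intersection over union). As a reminder of how that metric is commonly computed, here is a minimal pure-Python sketch. The function name `mean_iou`, the toy labels, and the convention of skipping classes that appear in neither the prediction nor the ground truth are assumptions for illustration, and the benchmarks' own evaluation protocols may differ.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:             # assumption: ignore classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Hypothetical per-point labels for six points and three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)  # (1/3 + 2/3 + 1/2) / 3 = 0.5
```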

arXiv information

Authors	Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla
Published	2024-03-18 11:41:55+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
