Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

Summary

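Generalized few-shot 3D point cloud segmentation (GFS-PCS) adapts a model to novel classes from only a few support samples while preserving its segmentation of base classes.
Existing GFS-PCS methods strengthen prototypes through interaction with support or query features, but they remain limited by the sparse knowledge available in few-shot samples.
Meanwhile, 3D vision-language models (3D VLMs), which generalize across open-world novel classes, hold rich but noisy knowledge of novel classes.
In this work, we introduce GFS-VL, a GFS-PCS framework that combines dense but noisy pseudo-labels from 3D VLMs with precise yet sparse few-shot samples to make the most of both.
Specifically, we present a prototype-guided pseudo-label selection that filters out low-quality regions, followed by an adaptive infilling strategy that labels the filtered, unlabeled regions by combining knowledge from the pseudo-label context and the few-shot samples.
In addition, we design a novel-base mix strategy that embeds few-shot samples into training scenes, preserving the context essential for better novel class learning.
Recognizing the limited diversity of current GFS-PCS benchmarks, we also introduce two challenging benchmarks with diverse novel classes for comprehensive evaluation of generalization.
Experiments validate the framework's effectiveness across models and datasets.
Our approach and benchmarks provide a solid foundation for advancing GFS-PCS in the real world.
The code is at https://github.com/zhaochongan/gfs-vl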

Summary (original)

Generalized few-shot 3D point cloud segmentation (GFS-PCS) adapts models to new classes with few support samples while retaining base class segmentation. Existing GFS-PCS methods enhance prototypes via interacting with support or query features but remain limited by sparse knowledge from few-shot samples. Meanwhile, 3D vision-language models (3D VLMs), generalizing across open-world novel classes, contain rich but noisy novel class knowledge. In this work, we introduce a GFS-PCS framework that synergizes dense but noisy pseudo-labels from 3D VLMs with precise yet sparse few-shot samples to maximize the strengths of both, named GFS-VL. Specifically, we present a prototype-guided pseudo-label selection to filter low-quality regions, followed by an adaptive infilling strategy that combines knowledge from pseudo-label contexts and few-shot samples to adaptively label the filtered, unlabeled areas. Additionally, we design a novel-base mix strategy to embed few-shot samples into training scenes, preserving essential context for improved novel class learning. Moreover, recognizing the limited diversity in current GFS-PCS benchmarks, we introduce two challenging benchmarks with diverse novel classes for comprehensive generalization evaluation. Experiments validate the effectiveness of our framework across models and datasets. Our approach and benchmarks provide a solid foundation for advancing GFS-PCS in the real world. The code is at https://github.com/ZhaochongAn/GFS-VL
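
The abstract only names the prototype-guided pseudo-label selection step, so the following is a minimal sketch of one way such a filter could work, not the authors' implementation (the linked repository is the reference for that). It assumes that a prototype is the mean feature of a class's support points, that a "region" is simply the set of points sharing one pseudo-label, and that regions are kept when their mean feature has cosine similarity above a fixed threshold with the prototype. The function names, the `UNLABELED` marker, and the threshold value are all hypothetical.

```python
import math
from collections import defaultdict

UNLABELED = -1  # hypothetical marker for points whose pseudo-label was rejected


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_prototypes(support_feats, support_labels):
    """Assumed prototype definition: mean support feature per novel class."""
    by_class = defaultdict(list)
    for feat, cls in zip(support_feats, support_labels):
        by_class[cls].append(feat)
    return {cls: mean_vector(feats) for cls, feats in by_class.items()}


def select_pseudo_labels(point_feats, pseudo_labels, prototypes, threshold=0.8):
    """Keep a pseudo-labelled region only if it resembles its class prototype."""
    regions = defaultdict(list)
    for idx, cls in enumerate(pseudo_labels):
        regions[cls].append(idx)

    filtered = list(pseudo_labels)
    for cls, idxs in regions.items():
        if cls not in prototypes:
            continue  # no few-shot prototype for this class: leave it untouched
        region_mean = mean_vector([point_feats[i] for i in idxs])
        if cosine(region_mean, prototypes[cls]) < threshold:
            for i in idxs:
                filtered[i] = UNLABELED  # low-quality region is dropped
    return filtered


# Toy data: 2-D features, two novel classes with few support points each.
support_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
support_labels = ["chair", "chair", "lamp"]
prototypes = build_prototypes(support_feats, support_labels)

# The "lamp" pseudo-labels sit on chair-like features, so they are rejected.
point_feats = [[1.0, 0.1], [0.8, 0.0], [0.9, 0.2], [1.0, 0.0]]
pseudo_labels = ["chair", "chair", "lamp", "lamp"]
filtered = select_pseudo_labels(point_feats, pseudo_labels, prototypes)
print(filtered)  # ['chair', 'chair', -1, -1]
```

In this sketch the rejected points are only marked as unlabeled; the adaptive infilling step described in the abstract, which relabels such regions, is not modeled here.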

arXiv information

Authors	Zhaochong An, Guolei Sun, Yun Liu, Runjia Li, Junlin Han, Ender Konukoglu, Serge Belongie
Published	2025-03-20 16:10:33+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
