SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation

Summary

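Weakly-supervised semantic segmentation (WSSS) aims to train segmentation models on image data annotated only with image-level labels.
Because precise pixel-level annotations are unavailable, existing methods typically focus on generating pseudo masks for training the segmentation model by refining CAM-like heatmaps.
However, the resulting heatmaps may capture only the most discriminative regions of the target object categories, or the co-occurring backgrounds associated with them.
To address these issues, we propose the Semantic Prompt Learning for WSSS (SemPLeS) framework, which learns to effectively prompt the CLIP space to enhance the semantic alignment between the segmented regions and the target object categories.
More specifically, we propose Contrastive Prompt Learning and Class-associated Semantic Refinement to learn prompts that adequately describe and suppress the image backgrounds associated with each target object category.
In this way, the framework achieves better semantic matching between object regions and their associated text labels, yielding the desired pseudo masks for training the segmentation model.
SemPLeS achieves SOTA performance on the standard WSSS benchmarks PASCAL VOC and MS COCO, and demonstrates interpretability through semantic visualization of the learned prompts.
The code will be released.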
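
The step of turning CAM-like heatmaps into a pseudo mask can be illustrated with a minimal sketch. This is not the SemPLeS pipeline, whose details the summary does not give. It assumes one heatmap per class, per-image min-max normalization, and a fixed background threshold; the names `pseudo_mask`, `normalize`, `BACKGROUND`, and the default `threshold=0.5` are hypothetical choices made for illustration.

```python
from typing import Dict, List

Heatmap = List[List[float]]

BACKGROUND = 0  # hypothetical label id reserved for background pixels


def normalize(heatmap: Heatmap) -> Heatmap:
    """Min-max scale a heatmap to [0, 1]; a flat map becomes all zeros."""
    values = [v for row in heatmap for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in heatmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in heatmap]


def pseudo_mask(heatmaps: Dict[int, Heatmap], threshold: float = 0.5) -> List[List[int]]:
    """Label each pixel with the class of highest normalized activation,
    or BACKGROUND when no class exceeds `threshold` (an assumed rule)."""
    normed = {cls: normalize(h) for cls, h in heatmaps.items()}
    first = next(iter(normed.values()))
    height, width = len(first), len(first[0])
    mask = [[BACKGROUND] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cls, best_val = BACKGROUND, threshold
            for cls, h in normed.items():
                if h[y][x] > best_val:
                    best_cls, best_val = cls, h[y][x]
            mask[y][x] = best_cls
    return mask
```

With a rule like this, a heatmap that fires only on the most discriminative part of an object leaves the rest of the object labeled as background, and a heatmap that fires on a co-occurring background labels that background as the object. These are the two failure modes the summary describes.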

Abstract (original)

Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models using training image data with only image-level supervision. Since precise pixel-level annotations are not accessible, existing methods typically focus on producing pseudo masks for training segmentation models by refining CAM-like heatmaps. However, the produced heatmaps may only capture discriminative image regions of target object categories or the associated co-occurring backgrounds. To address the issues, we propose a Semantic Prompt Learning for WSSS (SemPLeS) framework, which learns to effectively prompt the CLIP space to enhance the semantic alignment between the segmented regions and the target object categories. More specifically, we propose Contrastive Prompt Learning and Class-associated Semantic Refinement to learn the prompts that adequately describe and suppress the image backgrounds associated with each target object category. In this way, our proposed framework is able to perform better semantic matching between object regions and the associated text labels, resulting in desired pseudo masks for training the segmentation model. The proposed SemPLeS framework achieves SOTA performance on the standard WSSS benchmarks, PASCAL VOC and MS COCO, and demonstrated interpretability with the semantic visualization of our learned prompts. The codes will be released.
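
The abstract does not state the training objective, so the following is only a toy sketch of the general idea behind learning a background prompt contrastively. It assumes that a prompt embedding should be similar to the embedding of background regions and dissimilar to the embedding of the class label text. The function names and the simple difference-of-cosines loss are assumptions for illustration, and plain lists of floats stand in for CLIP embeddings.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def background_prompt_loss(
    prompt_emb: Sequence[float],
    background_emb: Sequence[float],
    class_text_emb: Sequence[float],
) -> float:
    """Hypothetical objective (lower is better): reward similarity to the
    background region, penalize similarity to the class label text."""
    return cosine(prompt_emb, class_text_emb) - cosine(prompt_emb, background_emb)
```

Under this assumed loss, a prompt embedding aligned with the background and orthogonal to the class text scores -1, while one aligned with the class text and orthogonal to the background scores +1.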

arXiv information

Authors	Ci-Siang Lin, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen
Published	2024-01-22 09:41:05+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
