SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation

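Abstract

Most recent mainstream weakly supervised semantic segmentation (WSSS) approaches rely on Class Activation Maps (CAMs) generated by a CNN (Convolutional Neural Network) based image classifier.
This paper proposes a new transformer-based framework for WSSS, the Semantic Guided Activation Transformer (SemFormer).
A transformer-based Class-Aware AutoEncoder (CAAE) is designed to extract class embeddings from the input image and to learn class semantics for every class in the dataset.
The class embeddings and the learned class semantics then guide the generation of activation maps through four losses: class-foreground, class-background, activation suppression, and activation complementation.
Experiments show that SemFormer achieves 74.3% mIoU on the PASCAL VOC 2012 dataset, surpassing many recent mainstream WSSS approaches by a large margin.
Code is available at https://github.com/JLChen-C/SemFormer.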
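
For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean Intersection-over-Union is commonly computed from a confusion matrix. It is not taken from the SemFormer repository. The function names `confusion_matrix` and `mean_iou` are hypothetical, and the `ignore_index=255` default assumes the usual PASCAL VOC convention of marking void pixels with label 255.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (target, prediction) label pairs over all valid pixels.

    Rows index the ground-truth class, columns the predicted class.
    Pixels whose ground-truth label equals ignore_index are skipped
    (assumed here to be the void label, as in PASCAL VOC).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    valid = target != ignore_index
    # Encode each (target, pred) pair as a single integer, then histogram it.
    idx = target[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both prediction and target
    return float((tp[present] / union[present]).mean())


# Toy 2x3 "image" with 3 classes; the last pixel is void and is ignored.
target = [[0, 0, 1], [1, 2, 255]]
pred = [[0, 1, 1], [1, 2, 2]]
conf = confusion_matrix(pred, target, num_classes=3)
print(round(mean_iou(conf), 4))  # prints 0.7222
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so their mean is about 0.7222. On a real benchmark the confusion matrix would be accumulated over every image in the evaluation set before the mean is taken.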

Abstract (original)

Recent mainstream weakly supervised semantic segmentation (WSSS) approaches are mainly based on Class Activation Map (CAM) generated by a CNN (Convolutional Neural Network) based image classifier. In this paper, we propose a novel transformer-based framework, named Semantic Guided Activation Transformer (SemFormer), for WSSS. We design a transformer-based Class-Aware AutoEncoder (CAAE) to extract the class embeddings for the input image and learn class semantics for all classes of the dataset. The class embeddings and learned class semantics are then used to guide the generation of activation maps with four losses, i.e., class-foreground, class-background, activation suppression, and activation complementation loss. Experimental results show that our SemFormer achieves 74.3% mIoU and surpasses many recent mainstream WSSS approaches by a large margin on PASCAL VOC 2012 dataset. Code will be available at https://github.com/JLChen-C/SemFormer.

arXiv information

Authors	Junliang Chen, Xiaodong Zhao, Cheng Luo, Linlin Shen
Published	2022-10-26 10:51:20+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
