Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation

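Summary

Preparing training data for deep vision models is a labor-intensive task.
To address this, generative models have emerged as an effective solution for generating synthetic data.
Current generative models produce only image-level category labels, whereas we propose a new method that uses the text-to-image generative model Stable Diffusion (SD) to generate pixel-level semantic segmentation labels.
By exploiting SD's text prompts, cross-attention, and self-attention, we introduce three new techniques: \textit{class-prompt appending}, \textit{class-prompt cross-attention}, and \textit{self-attention exponentiation}.
These techniques let us generate segmentation maps that correspond to the synthetic images.
The maps serve as pseudo-labels for training semantic segmenters, removing the need for labor-intensive pixel-wise annotation.
To account for imperfections in the pseudo-labels, we incorporate uncertainty regions into the segmentation so that the loss from those regions can be ignored.
We evaluate on two datasets, PASCAL VOC and MSCOCO, and our approach significantly outperforms concurrent work.
Our benchmarks and code will be released at https://github.com/VinAIResearch/Dataset-Diffusion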
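
As a rough illustration of the self-attention exponentiation idea named in the summary, the sketch below repeatedly multiplies a per-pixel class score map by a row-normalized self-attention matrix, so that pixels attending to each other end up with similar class scores. The abstract does not give the exact formulation; the function names, the `power` parameter, the toy matrices, and the pure-Python matrix product are all assumptions made for illustration, and a real pipeline would operate on SD attention tensors instead.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def refine_cross_attention(self_attn, cross_attn, power=2):
    """Hypothetical refinement: compute (self_attn ** power) @ cross_attn.

    self_attn:  N x N pixel-to-pixel affinities, each row summing to 1.
    cross_attn: N x C per-pixel scores for C classes.
    """
    refined = cross_attn
    for _ in range(power):
        # Each pass spreads class scores along the self-attention affinities.
        refined = matmul(self_attn, refined)
    return refined


def argmax_labels(scores):
    """Pick the highest-scoring class index for every pixel."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy example (made-up numbers): pixels 0 and 1 attend to each other,
# pixel 2 attends only to itself.
toy_self_attn = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
# Pixel 1 has a noisy score that slightly favors class 1.
toy_cross_attn = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.1, 0.9],
]

labels_before = argmax_labels(toy_cross_attn)
labels_after = argmax_labels(refine_cross_attention(toy_self_attn, toy_cross_attn))
```

In this toy case the noisy pixel 1 adopts the class of its strongly related neighbor after refinement, while the unrelated pixel 2 is left unchanged.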

Summary (original)

Preparing training data for deep vision models is a labor-intensive task. To address this, generative models have emerged as an effective solution for generating synthetic data. While current generative models produce image-level category labels, we propose a novel method for generating pixel-level semantic segmentation labels using the text-to-image generative model Stable Diffusion (SD). By utilizing the text prompts, cross-attention, and self-attention of SD, we introduce three new techniques: \textit{class-prompt appending}, \textit{class-prompt cross-attention}, and \textit{self-attention exponentiation}. These techniques enable us to generate segmentation maps corresponding to synthetic images. These maps serve as pseudo-labels for training semantic segmenters, eliminating the need for labor-intensive pixel-wise annotation. To account for the imperfections in our pseudo-labels, we incorporate uncertainty regions into the segmentation, allowing us to disregard loss from those regions. We conduct evaluations on two datasets, PASCAL VOC and MSCOCO, and our approach significantly outperforms concurrent work. Our benchmarks and code will be released at https://github.com/VinAIResearch/Dataset-Diffusion
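
The uncertainty-region idea in the abstract can be pictured with a small sketch: pixels whose pseudo-label is not confident enough receive a special ignore value, and the training loss skips them. The thresholding rule, the `threshold` value, and the function names below are assumptions for illustration only, since the abstract does not say how uncertain regions are chosen. The value 255 follows the common PASCAL VOC convention for void pixels.

```python
import math

IGNORE_INDEX = 255  # assumed marker for uncertain pixels (VOC-style void label)


def pseudo_labels(probs, threshold=0.6):
    """Assign each pixel its most likely class, or IGNORE_INDEX if unsure.

    probs: list of per-pixel class probability lists.
    threshold: hypothetical confidence cut-off, not taken from the paper.
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)
        labels.append(best if p[best] >= threshold else IGNORE_INDEX)
    return labels


def masked_cross_entropy(pred_probs, labels):
    """Mean cross-entropy over pixels whose label is not IGNORE_INDEX."""
    losses = [
        -math.log(p[y])
        for p, y in zip(pred_probs, labels)
        if y != IGNORE_INDEX
    ]
    # With no confident pixels there is nothing to learn from.
    return sum(losses) / len(losses) if losses else 0.0


# Toy example (made-up numbers): the middle pixel is too ambiguous to label.
toy_probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
toy_labels = pseudo_labels(toy_probs)
toy_loss = masked_cross_entropy(toy_probs, toy_labels)
```

Deep learning frameworks usually expose the same behavior through an ignore-index option on their cross-entropy loss, so a hand-written mask like this is rarely needed in practice.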

arXiv information

Authors	Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen
Published	2023-09-25 17:19:26+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
