Canonical Latent Representations in Conditional Diffusion Models

Summary

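Conditional diffusion models (CDMs) have shown impressive performance across a range of generative tasks.
Their ability to model the full data distribution has opened new avenues for analysis-by-synthesis in downstream discriminative learning.
However, this same modeling capacity causes CDMs to entangle class-defining features with irrelevant context, which makes it difficult to extract robust and interpretable representations.
To address this, we identify Canonical LAtent Representations (CLAReps): latent codes whose internal CDM features preserve essential categorical information while discarding non-discriminative signals.
When decoded, CLAReps produce representative samples for each class, giving an interpretable and compact summary of the core class semantics with minimal irrelevant detail.
Building on CLAReps, we develop CaDistill, a novel diffusion-based feature-distillation paradigm.
The student has full access to the training set, but the CDM, acting as teacher, transfers core class knowledge only through CLAReps, which amount to just 10% of the training data in size.
After training, the student achieves strong adversarial robustness and generalization, focusing more on class signals than on spurious background cues.
Our findings suggest that CDMs can serve not only as image generators but also as compact, interpretable teachers that drive robust representation learning.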
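
The summary gives no formula for CaDistill, so the following is only a generic sketch of how a feature-distillation objective of this shape is often written: a classification loss over the full training set plus a feature-matching term applied only to the small set of teacher-provided samples. The function names, the use of mean squared error, and the `weight` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def feature_mse(student_feat, teacher_feat):
    """Mean squared error between two equal-length feature vectors."""
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)

def combined_loss(labeled, distill_pairs, weight=1.0):
    """
    Hypothetical combined objective (an assumption, not the paper's loss).

    labeled: list of (student_logits, label) from the full training set.
    distill_pairs: list of (student_feature, teacher_feature) for the
        small set of teacher-provided samples; may be empty.
    weight: assumed scalar balancing the two terms.
    """
    ce = sum(cross_entropy(lg, y) for lg, y in labeled) / len(labeled)
    if not distill_pairs:
        return ce
    distill = sum(feature_mse(s, t) for s, t in distill_pairs) / len(distill_pairs)
    return ce + weight * distill

# Toy usage: one labeled sample, one teacher-guided sample.
loss = combined_loss(
    labeled=[([0.0, 0.0], 0)],
    distill_pairs=[([1.0, 2.0], [1.0, 0.0])],
    weight=0.5,
)
```

In this sketch the teacher influences the student only through `distill_pairs`, which mirrors the description that the teacher's signal covers a much smaller set than the student's training data.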

Summary (original)

Conditional diffusion models (CDMs) have shown impressive performance across a range of generative tasks. Their ability to model the full data distribution has opened new avenues for analysis-by-synthesis in downstream discriminative learning. However, this same modeling capacity causes CDMs to entangle the class-defining features with irrelevant context, posing challenges to extracting robust and interpretable representations. To this end, we identify Canonical LAtent Representations (CLAReps), latent codes whose internal CDM features preserve essential categorical information while discarding non-discriminative signals. When decoded, CLAReps produce representative samples for each class, offering an interpretable and compact summary of the core class semantics with minimal irrelevant details. Exploiting CLAReps, we develop a novel diffusion-based feature-distillation paradigm, CaDistill. While the student has full access to the training set, the CDM as teacher transfers core class knowledge only via CLAReps, which amounts to merely 10 % of the training data in size. After training, the student achieves strong adversarial robustness and generalization ability, focusing more on the class signals instead of spurious background cues. Our findings suggest that CDMs can serve not just as image generators but also as compact, interpretable teachers that can drive robust representation learning.

arXiv information

Authors	Yitao Xu, Tong Zhang, Ehsan Pajouheshgar, Sabine Süsstrunk
Published	2025-06-11 17:28:52+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
