Diffusion Visual Counterfactual Explanations

Summary

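Visual Counterfactual Explanations (VCEs) are an important tool for understanding the decisions of an image classifier.
They are "small" but "realistic" semantic changes to an image that change the classifier's decision.
Current approaches to generating VCEs are restricted to adversarially robust models and often contain unrealistic artifacts, or are limited to image classification problems with few classes.
This paper overcomes these limitations by generating Diffusion Visual Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers via a diffusion process.
Two modifications to the diffusion process are key to DVCEs. First, an adaptive parameterization whose hyperparameters generalize across images and models, combined with distance regularization and a late start of the diffusion process, makes it possible to generate images that differ from the original only by minimal semantic changes yet receive a different classification.
Second, cone regularization via an adversarially robust model ensures that the diffusion process does not converge to trivial non-semantic changes and instead produces realistic images of the target class that the classifier assigns high confidence.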
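The summary names cone regularization but does not spell out how it is computed. One plausible reading, offered here only as a sketch, is a Euclidean projection of the target classifier's gradient onto a cone centered on the robust model's gradient. The function name `project_onto_cone`, the 30-degree default angle, and the use of flattened NumPy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def project_onto_cone(w, v, max_angle_deg=30.0):
    """Project w onto the cone of vectors within max_angle_deg of v.

    Hypothetical helper: w stands in for the gradient of the classifier
    being explained, v for the gradient of an adversarially robust model.
    Assumes 0 < max_angle_deg < 90 so that the cone is convex.
    """
    w = np.asarray(w, dtype=float).ravel()
    v = np.asarray(v, dtype=float).ravel()
    v_norm = np.linalg.norm(v)
    if v_norm == 0.0:
        raise ValueError("cone axis v must be non-zero")
    v_hat = v / v_norm
    alpha = np.deg2rad(max_angle_deg)

    # Split w into a component along the axis and an orthogonal remainder.
    a = w @ v_hat
    w_perp = w - a * v_hat
    b = np.linalg.norm(w_perp)

    # Already inside the cone: leave w unchanged.
    if np.arctan2(b, a) <= alpha:
        return w.copy()

    # w points (almost) exactly opposite to v: the nearest cone point is 0.
    if b < 1e-12:
        return np.zeros_like(w)

    # Otherwise project onto the nearest boundary ray of the cone,
    # which lies in the plane spanned by v and w.
    u_hat = w_perp / b
    d = np.cos(alpha) * v_hat + np.sin(alpha) * u_hat
    return max(0.0, w @ d) * d
```

Under these assumptions, a gradient that already agrees with the robust model's direction passes through untouched, while one pointing far away from it is pulled back to the cone boundary or suppressed entirely. That is one way to keep a guided diffusion step from following purely non-semantic directions.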

Summary (original)

Visual Counterfactual Explanations (VCEs) are an important tool to understand the decisions of an image classifier. They are ‘small’ but ‘realistic’ semantic changes of the image changing the classifier decision. Current approaches for the generation of VCEs are restricted to adversarially robust models and often contain non-realistic artefacts, or are limited to image classification problems with few classes. In this paper, we overcome this by generating Diffusion Visual Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers via a diffusion process. Two modifications to the diffusion process are key for our DVCEs: first, an adaptive parameterization, whose hyperparameters generalize across images and models, together with distance regularization and late start of the diffusion process, allow us to generate images with minimal semantic changes to the original ones but different classification. Second, our cone regularization via an adversarially robust model ensures that the diffusion process does not converge to trivial non-semantic changes, but instead produces realistic images of the target class which achieve high confidence by the classifier.

arXiv information

Authors	Maximilian Augustin, Valentyn Boreiko, Francesco Croce, Matthias Hein
Published	2022-10-21 09:35:47+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
