Mixture of Gaussian-distributed Prototypes with Generative Modelling for Interpretable and Trustworthy Image Recognition

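Summary

Prototypical-part methods such as ProtoPNet make image recognition more interpretable by linking predictions to training prototypes, which gives intuitive insight into how decisions are made.
Existing methods learn prototypes as points and typically face two critical problems: 1) the learned prototypes have limited representation power and are poorly suited to detecting out-of-distribution (OoD) inputs, which reduces the trustworthiness of their decisions;
and 2) the learned prototypes must be projected back into the space of training images, which sharply degrades predictive performance.
In addition, current prototype learning takes an aggressive approach that considers only the most active object parts during training and overlooks sub-salient object regions that still carry important classification information.
This paper presents a new generative paradigm for learning prototype distributions, called Mixture of Gaussian-distributed Prototypes (MGProto).
The prototype distributions learned by MGProto enable both interpretable image classification and trustworthy recognition of OoD inputs.
Optimising MGProto naturally projects the learned prototype distributions back into the training image space, which addresses the performance degradation caused by prototype projection.
The authors also develop a novel and effective prototype mining strategy that considers sub-salient object parts as well as the most active ones.
To promote model compactness, they further propose pruning MGProto by removing prototypes with low importance priors.
Experiments on the CUB-200-2011, Stanford Cars, Stanford Dogs, and Oxford-IIIT Pets datasets show that MGProto achieves state-of-the-art image recognition and OoD detection performance while providing encouraging interpretability results.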
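
The abstract describes three uses of a per-class mixture of Gaussian prototypes: classifying by class likelihood, flagging OoD inputs by low overall likelihood, and pruning prototypes with low priors. The NumPy sketch below illustrates that general idea only and is not the authors' implementation. It assumes diagonal covariances, a uniform class prior, and a single pre-extracted feature vector; all function and variable names are hypothetical, and the feature extractor, training procedure, and prototype mining strategy are omitted.

```python
import numpy as np

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along an axis."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def log_gaussian(x, means, variances):
    """Log-density of diagonal Gaussians; x: (D,), means/variances: (C, K, D)."""
    d = x.shape[-1]
    return -0.5 * (np.sum((x - means) ** 2 / variances, axis=-1)
                   + np.sum(np.log(variances), axis=-1)
                   + d * np.log(2.0 * np.pi))

def class_log_likelihoods(x, means, variances, priors):
    """log p(x | c) for each class c; priors: (C, K), each row sums to 1."""
    with np.errstate(divide="ignore"):  # pruned prototypes have prior 0
        log_priors = np.log(priors)
    return logsumexp(log_gaussian(x, means, variances) + log_priors, axis=1)

def predict(x, means, variances, priors):
    """Return (predicted class, log p(x)); a low log p(x) suggests an OoD input."""
    ll = class_log_likelihoods(x, means, variances, priors)
    log_px = logsumexp(ll) - np.log(len(ll))  # uniform class prior (assumption)
    return int(np.argmax(ll)), float(log_px)

def prune(priors, threshold):
    """Zero out priors below threshold (keeping each class's largest) and renormalise."""
    keep = priors >= threshold
    keep[np.arange(len(priors)), priors.argmax(axis=1)] = True
    kept = np.where(keep, priors, 0.0)
    return kept / kept.sum(axis=1, keepdims=True)

# Toy setup: 2 classes, 2 prototypes per class, 2-D features.
means = np.array([[[0.0, 0.0], [1.0, 0.0]],
                  [[5.0, 5.0], [6.0, 5.0]]])
variances = np.ones_like(means)
priors = np.array([[0.9, 0.1],
                   [0.5, 0.5]])

label, score_in = predict(np.array([0.1, 0.0]), means, variances, priors)
_, score_far = predict(np.array([100.0, 100.0]), means, variances, priors)
pruned = prune(priors, threshold=0.2)
```

In this toy setup, an input far from every prototype receives a much lower log p(x) than one near a prototype, which is the signal a likelihood threshold would use for OoD detection. How the threshold is chosen is left open here.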

Abstract (original)

Prototypical-part methods, e.g., ProtoPNet, enhance interpretability in image recognition by linking predictions to training prototypes, thereby offering intuitive insights into their decision-making. Existing methods, which rely on a point-based learning of prototypes, typically face two critical issues: 1) the learned prototypes have limited representation power and are not suitable to detect Out-of-Distribution (OoD) inputs, reducing their decision trustworthiness; and 2) the necessary projection of the learned prototypes back into the space of training images causes a drastic degradation in the predictive performance. Furthermore, current prototype learning adopts an aggressive approach that considers only the most active object parts during training, while overlooking sub-salient object regions which still hold crucial classification information. In this paper, we present a new generative paradigm to learn prototype distributions, termed as Mixture of Gaussian-distributed Prototypes (MGProto). The distribution of prototypes from MGProto enables both interpretable image classification and trustworthy recognition of OoD inputs. The optimisation of MGProto naturally projects the learned prototype distributions back into the training image space, thereby addressing the performance degradation caused by prototype projection. Additionally, we develop a novel and effective prototype mining strategy that considers not only the most active but also sub-salient object parts. To promote model compactness, we further propose to prune MGProto by removing prototypes with low importance priors. Experiments on CUB-200-2011, Stanford Cars, Stanford Dogs, and Oxford-IIIT Pets datasets show that MGProto achieves state-of-the-art image recognition and OoD detection performances, while providing encouraging interpretability results.

arXiv information

Authors	Chong Wang, Yuanhong Chen, Fengbei Liu, Yuyuan Liu, Davis James McCarthy, Helen Frazer, Gustavo Carneiro
Published	2024-06-05 17:03:02+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
