Disentangled Explanations of Neural Network Predictions by Finding Relevant Subspaces

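Summary

Explainable AI aims to overcome the black-box nature of complex ML models such as neural networks by generating explanations for their predictions.
Explanations often take the form of heatmaps that identify the input features (e.g. pixels) relevant to the model's decision.
These explanations, however, entangle the multiple factors that may contribute to the overall, complex decision strategy.
We propose to disentangle explanations by extracting, at an intermediate layer of the neural network, subspaces that capture the multiple distinct activation patterns (e.g. visual concepts) relevant to the prediction.
To extract these subspaces automatically, we propose two new analyses that extend principles found in PCA and ICA to explanations.
These analyses, called principal relevant component analysis (PRCA) and disentangled relevant subspace analysis (DRSA), maximize relevance instead of, for example, variance or kurtosis.
This focuses the analysis on what the ML model actually uses for its predictions and ignores activations or concepts to which the model is invariant.
Our approach is general enough to be used together with common attribution techniques such as Shapley Value, Integrated Gradients, and LRP.
The proposed methods prove practically useful and compare favorably to the state of the art, as demonstrated on benchmarks and three use cases.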

Summary (original)

Explainable AI aims to overcome the black-box nature of complex ML models like neural networks by generating explanations for their predictions. Explanations often take the form of a heatmap identifying input features (e.g. pixels) that are relevant to the model’s decision. These explanations, however, entangle the potentially multiple factors that enter into the overall complex decision strategy. We propose to disentangle explanations by extracting at some intermediate layer of a neural network, subspaces that capture the multiple and distinct activation patterns (e.g. visual concepts) that are relevant to the prediction. To automatically extract these subspaces, we propose two new analyses, extending principles found in PCA or ICA to explanations. These novel analyses, which we call principal relevant component analysis (PRCA) and disentangled relevant subspace analysis (DRSA), maximize relevance instead of e.g. variance or kurtosis. This allows for a much stronger focus of the analysis on what the ML model actually uses for predicting, ignoring activations or concepts to which the model is invariant. Our approach is general enough to work alongside common attribution techniques such as Shapley Value, Integrated Gradients, or LRP. Our proposed methods show to be practically useful and compare favorably to the state of the art as demonstrated on benchmarks and three use cases.
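
The idea of maximizing relevance instead of variance can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the toy model f(a) = a[1]**2, the use of gradient-times-input as the attribution (so that relevance is the dot product of an activation with a "context" vector, here the gradient), the symmetrized cross-covariance eigenproblem used to pick the relevant direction, and the hypothetical function names. It should not be read as the authors' implementation of PRCA or DRSA.

```python
import numpy as np

def top_eigvecs(sym_matrix, d):
    """Return the d eigenvectors of a symmetric matrix with the largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(sym_matrix)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :d]                 # reorder columns to descending

def variance_basis(acts, d):
    """PCA-style basis: directions of maximal activation variance."""
    centered = acts - acts.mean(axis=0)
    return top_eigvecs(centered.T @ centered / len(acts), d)

def relevance_basis(acts, ctx, d):
    """Assumed relevance-style basis: orthonormal U maximizing the mean of
    (U^T a) . (U^T c), i.e. the relevance retained inside the subspace."""
    cross = acts.T @ ctx / len(acts)
    return top_eigvecs(0.5 * (cross + cross.T), d)

def captured_relevance(acts, ctx, basis):
    """Mean relevance that survives projection of a and c onto the basis."""
    return float(np.mean(np.sum((acts @ basis) * (ctx @ basis), axis=1)))

# Synthetic activations: dimension 0 has large variance, dimension 1 small variance.
rng = np.random.default_rng(0)
acts = rng.normal(size=(2000, 3)) * np.array([5.0, 1.0, 0.1])

# Toy model f(a) = a[1]**2 ignores dimensions 0 and 2 (it is invariant to them).
# Its gradient serves as the context vector for gradient-times-input relevance.
ctx = np.zeros_like(acts)
ctx[:, 1] = 2.0 * acts[:, 1]

u_var = variance_basis(acts, 1)
u_rel = relevance_basis(acts, ctx, 1)

print("variance direction :", np.round(np.abs(u_var[:, 0]), 2))
print("relevance direction:", np.round(np.abs(u_rel[:, 0]), 2))
print("relevance captured (variance basis) :", captured_relevance(acts, ctx, u_var))
print("relevance captured (relevance basis):", captured_relevance(acts, ctx, u_rel))
```

In this toy setting the variance-based direction aligns with the high-variance dimension that the model never uses, while the relevance-based direction aligns with the one dimension the model depends on, which mirrors the abstract's point about ignoring activations to which the model is invariant.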

arXiv information

Authors	Pattarawat Chormai, Jan Herrmann, Klaus-Robert Müller, Grégoire Montavon
Published	2024-04-15 08:24:42+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
