Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions

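Summary

Machine learning methods have become far more accurate, but also more complex and less transparent.
As a result, practitioners often rely on explainers to make these black-box prediction models interpretable.
Because explainers serve as key diagnostic tools, they must themselves be robust.
This paper focuses on one specific aspect of robustness: an explainer should give similar explanations for similar data inputs.
The authors formalize this notion by introducing and defining explainer astuteness, analogous to the astuteness of prediction functions.
This formalism connects explainer robustness to the predictor's probabilistic Lipschitzness, which captures the probability that a function is locally smooth.
Given the Lipschitzness of the prediction function, they derive lower-bound guarantees on the astuteness of several explainers (e.g., SHAP, RISE, CXPlain).
These theoretical results imply that locally smooth prediction functions lend themselves to locally robust explanations.
The results are evaluated empirically on both simulated and real datasets.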
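The abstract does not give the formal definitions, so the following is only a rough sketch under assumed definitions. It treats both probabilistic Lipschitzness and explainer astuteness as the fraction of nearby input pairs (Euclidean distance at most `radius`) whose outputs differ by at most a constant times the input distance. The function name `empirical_local_lipschitz_rate`, the linear `predictor`, and the `toy_explainer` (elementwise weight times input) are hypothetical illustrations, not the paper's code or any library's API.

```python
import numpy as np

def empirical_local_lipschitz_rate(func, X, radius, lipschitz_const):
    """Fraction of sample pairs within `radius` of each other whose outputs
    differ by at most `lipschitz_const` times their input distance.
    (Assumed empirical stand-in for the probabilistic notions in the abstract.)"""
    outputs = np.array([np.atleast_1d(func(x)) for x in X], dtype=float)
    # Pairwise Euclidean distances between inputs and between outputs.
    d_in = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d_out = np.linalg.norm(outputs[:, None, :] - outputs[None, :, :], axis=-1)
    # Count each unordered pair once, and keep only nearby pairs.
    i, j = np.triu_indices(len(X), k=1)
    near = d_in[i, j] <= radius
    if not near.any():
        return float("nan")  # no pairs fall inside the radius
    return float(np.mean(d_out[i, j][near] <= lipschitz_const * d_in[i, j][near]))

# Hypothetical toy setup: a linear predictor and a per-feature attribution.
w = np.array([2.0, -1.0, 0.5])

def predictor(x):
    return w @ x          # scalar prediction

def toy_explainer(x):
    return w * x          # one attribution score per feature

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))

print("predictor rate:", empirical_local_lipschitz_rate(predictor, X, 0.3, np.linalg.norm(w)))
print("explainer rate:", empirical_local_lipschitz_rate(toy_explainer, X, 0.3, np.abs(w).max()))
```

In this toy case the predictor is globally Lipschitz with constant equal to the norm of `w`, and the attribution map inherits a comparable constant, which mirrors the qualitative claim that smooth predictors lend themselves to robust explanations. The actual bounds for SHAP, RISE, and CXPlain are those derived in the paper, not this sketch.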

Abstract (original)

Machine learning methods have significantly improved in their predictive capabilities, but at the same time they are becoming more complex and less transparent. As a result, explainers are often relied on to provide interpretability to these black-box prediction models. As crucial diagnostics tools, it is important that these explainers themselves are robust. In this paper we focus on one particular aspect of robustness, namely that an explainer should give similar explanations for similar data inputs. We formalize this notion by introducing and defining explainer astuteness, analogous to astuteness of prediction functions. Our formalism allows us to connect explainer robustness to the predictor’s probabilistic Lipschitzness, which captures the probability of local smoothness of a function. We provide lower bound guarantees on the astuteness of a variety of explainers (e.g., SHAP, RISE, CXPlain) given the Lipschitzness of the prediction function. These theoretical results imply that locally smooth prediction functions lend themselves to locally robust explanations. We evaluate these results empirically on simulated as well as real datasets.

arXiv information

Authors	Zulqarnain Khan, Davin Hill, Aria Masoomi, Joshua Bone, Jennifer Dy
Published	2024-04-16 16:27:15+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
