What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods

Summary

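Many explainability methods, along with associated fidelity metrics, have been proposed to help us better understand how modern AI systems make decisions.
However, much of this work remains theoretical and gives little consideration to the human end-user.
In particular, it is not yet known (1) how useful current explainability methods actually are in more realistic scenarios, or (2) whether the associated performance metrics accurately predict how much knowledge an individual explanation gives a human end-user who is trying to understand the system's inner workings.
To fill this gap, we ran large-scale psychophysics experiments to evaluate how well human participants can use representative attribution methods to understand the behavior of different image classifiers in three real-world scenarios: identifying bias in an AI system, characterizing the visual strategy it uses on tasks that are too difficult for an untrained, non-expert human observer, and understanding its failure cases.
Our results show that the degree to which individual attribution methods help human participants better understand an AI system varies widely across these scenarios.
This suggests a critical need for the field to move beyond quantitative improvements to current attribution methods and toward complementary approaches that give human end-users qualitatively different sources of information.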

Summary (original)

A multitude of explainability methods and associated fidelity performance metrics have been proposed to help better understand how modern AI systems make decisions. However, much of the current work has remained theoretical — without much consideration for the human end-user. In particular, it is not yet known (1) how useful current explainability methods are in practice for more real-world scenarios and (2) how well associated performance metrics accurately predict how much knowledge individual explanations contribute to a human end-user trying to understand the inner-workings of the system. To fill this gap, we conducted psychophysics experiments at scale to evaluate the ability of human participants to leverage representative attribution methods for understanding the behavior of different image classifiers representing three real-world scenarios: identifying bias in an AI system, characterizing the visual strategy it uses for tasks that are too difficult for an untrained non-expert human observer as well as understanding its failure cases. Our results demonstrate that the degree to which individual attribution methods help human participants better understand an AI system varied widely across these scenarios. This suggests a critical need for the field to move past quantitative improvements of current attribution methods towards the development of complementary approaches that provide qualitatively different sources of information to human end-users.

arXiv information

Authors	Julien Colin, Thomas Fel, Remi Cadene, Thomas Serre
Published	2022-06-03 14:41:39+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
