Saliency-Bench: A Comprehensive Benchmark for Evaluating Visual Explanations

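Abstract

Explainable AI (XAI) has attracted considerable attention for offering insight into how deep learning models reach their decisions, particularly in image classification, where explanations are visualized as saliency maps. Despite this success, progress is held back by a shortage of annotated datasets and standardized evaluation pipelines. This paper introduces Saliency-Bench, a new benchmark suite designed to evaluate the visual explanations that saliency methods produce across multiple datasets. The authors curated, constructed, and annotated eight datasets with corresponding ground-truth explanations, covering diverse tasks such as scene classification, cancer diagnosis, object classification, and action classification. The benchmark includes a standardized, unified evaluation pipeline that assesses both the faithfulness and the alignment of visual explanations, giving an overall assessment of explanation performance. The eight datasets are benchmarked with widely used saliency methods on different image classifier architectures to evaluate explanation quality. In addition, the authors developed an easy-to-use API that automates the evaluation pipeline from data access and data loading through to result evaluation. The benchmark is available on the project website: https://xaidataset.github.io.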

Abstract (original)

Explainable AI (XAI) has gained significant attention for providing insights into the decision-making processes of deep learning models, particularly for image classification tasks through visual explanations visualized by saliency maps. Despite their success, challenges remain due to the lack of annotated datasets and standardized evaluation pipelines. In this paper, we introduce Saliency-Bench, a novel benchmark suite designed to evaluate visual explanations generated by saliency methods across multiple datasets. We curated, constructed, and annotated eight datasets, each covering diverse tasks such as scene classification, cancer diagnosis, object classification, and action classification, with corresponding ground-truth explanations. The benchmark includes a standardized and unified evaluation pipeline for assessing faithfulness and alignment of the visual explanation, providing a holistic visual explanation performance assessment. We benchmark these eight datasets with widely used saliency methods on different image classifier architectures to evaluate explanation quality. Additionally, we developed an easy-to-use API for automating the evaluation pipeline, from data accessing, and data loading, to result evaluation. The benchmark is available via our website: https://xaidataset.github.io.
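To make the idea of "alignment" with a ground-truth explanation concrete, here is a minimal sketch that scores a saliency map against a human-annotated mask using intersection over union (IoU). This is an illustration only: the abstract does not state which alignment metrics Saliency-Bench uses, and the function names `binarize` and `iou`, the 0.5 threshold, and the toy data are all assumptions, not part of the benchmark's API.

```python
# Hypothetical illustration: IoU between a thresholded saliency map and a
# ground-truth mask. Names, threshold, and data are assumptions, not the
# Saliency-Bench API.

def binarize(saliency, threshold=0.5):
    """Turn a 2D saliency map (values in [0, 1]) into a 0/1 mask."""
    return [[1 if v >= threshold else 0 for v in row] for row in saliency]


def iou(pred_mask, gt_mask):
    """Intersection over union of two equally sized 0/1 masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred_mask, gt_mask):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks are treated as identical.
    return inter / union if union else 1.0


# Toy 3x3 saliency map and annotated ground-truth region.
saliency = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
    [0.1, 0.0, 0.0],
]
ground_truth = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]

score = iou(binarize(saliency), ground_truth)
print(f"alignment IoU: {score:.2f}")
```

A higher score means the highlighted region overlaps more with the annotated explanation. Faithfulness, the other property named in the abstract, would need access to the classifier itself and is not covered by this sketch.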

arXiv information

Authors	Yifei Zhang, James Song, Siyi Gu, Tianxu Jiang, Bo Pan, Guangji Bai, Liang Zhao
Published	2025-03-03 09:26:26+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
