Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons

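Summary

Minimal sufficient reasons are a common form of explanation: the smallest subset of input features that, when held fixed at their given values, guarantees that the prediction does not change.
Earlier post-hoc methods try to obtain such explanations but face two main limitations. (1) Finding these subsets is computationally hard, so most scalable methods converge to suboptimal, less meaningful subsets.
(2) These methods rely heavily on sampling out-of-distribution input assignments, which can lead to counterintuitive behavior.
To address these limitations, this work proposes a self-supervised training approach called *sufficient subset training* (SST).
With SST, a model is trained to produce a concise sufficient reason for its prediction as an integral part of its output.
The results indicate that the framework produces succinct and faithful subsets substantially more efficiently than competing post-hoc methods, while maintaining comparable predictive performance.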

Abstract (original)

Minimal sufficient reasons represent a prevalent form of explanation – the smallest subset of input features which, when held constant at their corresponding values, ensure that the prediction remains unchanged. Previous post-hoc methods attempt to obtain such explanations but face two main limitations: (1) Obtaining these subsets poses a computational challenge, leading most scalable methods to converge towards suboptimal, less meaningful subsets; (2) These methods heavily rely on sampling out-of-distribution input assignments, potentially resulting in counterintuitive behaviors. To tackle these limitations, we propose in this work a self-supervised training approach, which we term *sufficient subset training* (SST). Using SST, we train models to generate concise sufficient reasons for their predictions as an integral part of their output. Our results indicate that our framework produces succinct and faithful subsets substantially more efficiently than competing post-hoc methods, while maintaining comparable predictive performance.
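
To make the definition above concrete, here is a minimal sketch of what "sufficient" and "minimal" mean for a tiny classifier over discrete features. It is not the SST method from the paper and not any of the post-hoc methods it compares against: it simply enumerates every assignment of the unfixed features, which is exact but exponential in the number of features. The names `is_sufficient`, `minimal_sufficient_reason`, and `toy_predict` are hypothetical, and the toy classifier is an assumption made up for illustration.

```python
from itertools import combinations, product


def is_sufficient(predict, x, subset, domain):
    """Return True if fixing the features of x indexed by `subset` forces the
    same prediction for every assignment of the remaining features."""
    target = predict(x)
    free = [i for i in range(len(x)) if i not in subset]
    # Exhaustively try every value combination for the features left free.
    for values in product(domain, repeat=len(free)):
        z = list(x)
        for i, v in zip(free, values):
            z[i] = v
        if predict(tuple(z)) != target:
            return False
    return True


def minimal_sufficient_reason(predict, x, domain):
    """Brute-force search over subsets, smallest first (exponential cost).
    The full feature set is always sufficient, so the search always returns."""
    n = len(x)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            if is_sufficient(predict, x, set(subset), domain):
                return set(subset)


# Hypothetical toy classifier over three binary features: x0 AND (x1 OR x2).
def toy_predict(x):
    return int(x[0] and (x[1] or x[2]))


reason = minimal_sufficient_reason(toy_predict, (1, 1, 0), domain=(0, 1))
print(reason)
```

For the input `(1, 1, 0)`, fixing features 0 and 1 is enough to pin the prediction regardless of feature 2, while no single feature is. The nested enumeration also hints at why exact search does not scale beyond a handful of features.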

arXiv information

Authors	Shahaf Bassan, Shlomit Gur, Ron Eliav
Published	2025-02-05 17:29:12+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
