Better Explain Transformers by Illuminating Important Information

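Summary

Transformer-based models excel at a wide range of natural language processing (NLP) tasks, and many efforts have been made to explain their inner workings.
Prior methods explain Transformers by using raw gradients and attention as token attribution scores, but irrelevant information is often included when the explanation is computed, which makes the results confusing.
In this work, we propose highlighting important information and removing irrelevant information through a refined information flow built on the layer-wise relevance propagation (LRP) method.
Specifically, we identify syntactic and positional heads as the important attention heads and focus on the relevance obtained from those heads.
Experimental results show that irrelevant information distorts the output attribution scores and should therefore be masked when the explanation is computed.
Compared with eight baselines on both classification and question-answering datasets, our method improves explanation metrics by over 3% to 33%, giving better explanation performance.
The anonymous code repository is available at https://github.com/LinxinS97/Mask-LRP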
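
The head-masking idea described in the summary can be illustrated with a toy sketch. This is not the authors' implementation and is not taken from the repository above: the function name `masked_token_relevance`, the input layout, and the numbers are all hypothetical, and the real method applies the mask inside the layer-wise LRP propagation, not to a single table of per-head scores. The sketch only shows how dropping the relevance of heads outside an "important" set changes the final token attribution.

```python
def masked_token_relevance(head_relevance, important_heads):
    """Sum per-token relevance over the heads kept by the mask.

    head_relevance: list of per-head lists; head_relevance[h][t] is the
        relevance that head h assigns to token t (hypothetical input).
    important_heads: set of head indices treated as important, for example
        heads identified as syntactic or positional.
    """
    num_tokens = len(head_relevance[0])
    scores = [0.0] * num_tokens
    for h, per_token in enumerate(head_relevance):
        if h not in important_heads:
            continue  # masked head: its relevance is dropped entirely
        for t, r in enumerate(per_token):
            scores[t] += r
    return scores


# Toy data: 3 heads, 4 tokens. Head 1 is assumed to be irrelevant.
head_relevance = [
    [0.5, 0.25, 0.0, 0.25],
    [0.0, 0.0, 1.0, 0.0],
    [0.25, 0.5, 0.0, 0.25],
]
scores = masked_token_relevance(head_relevance, important_heads={0, 2})
# scores == [0.75, 0.75, 0.0, 0.5]
```

Without the mask, token 2 would receive a score of 1.0 solely from head 1. With head 1 masked, its score falls to 0.0, which is the kind of distortion the summary says irrelevant information introduces.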

Summary (original)

Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings. Prior methods explain Transformers by focusing on the raw gradient and attention as token attribution scores, where non-relevant information is often considered during explanation computation, resulting in confusing results. In this work, we propose highlighting the important information and eliminating irrelevant information by a refined information flow on top of the layer-wise relevance propagation (LRP) method. Specifically, we consider identifying syntactic and positional heads as important attention heads and focus on the relevance obtained from these important heads. Experimental results demonstrate that irrelevant information does distort output attribution scores and then should be masked during explanation computation. Compared to eight baselines on both classification and question-answering datasets, our method consistently outperforms with over 3% to 33% improvement on explanation metrics, providing superior explanation performance. Our anonymous code repository is available at: https://github.com/LinxinS97/Mask-LRP

arXiv information

Authors	Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li
Published	2024-01-18 13:41:08+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
