Post-hoc Interpretability for Neural NLP: A Survey

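Summary

Neural networks for NLP are becoming increasingly complex and widespread, and there is growing concern about whether these models can be used responsibly.
Explaining models helps address safety and ethical concerns and is essential for accountability.
Interpretability serves to provide these explanations in terms that humans can understand.
Additionally, post-hoc methods provide explanations after a model has been trained and are generally model-agnostic.
This survey categorizes how recent post-hoc interpretability methods communicate explanations to humans, discusses each method in depth, and describes how the methods are validated, since validation is a common concern.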

Summary (original)

Neural networks for NLP are becoming increasingly complex and widespread, and there is a growing concern if these models are responsible to use. Explaining models helps to address the safety and ethical concerns and is essential for accountability. Interpretability serves to provide these explanations in terms that are understandable to humans. Additionally, post-hoc methods provide explanations after a model is learned and are generally model-agnostic. This survey provides a categorization of how recent post-hoc interpretability methods communicate explanations to humans, it discusses each method in-depth, and how they are validated, as the latter is often a common concern.

arXiv information

Authors	Andreas Madsen, Siva Reddy, Sarath Chandar
Published	2023-11-28 06:39:41+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
