A Vulnerability of Attribution Methods Using Pre-Softmax Scores

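Summary

We describe a vulnerability affecting a category of attribution methods used to explain the outputs of convolutional neural networks that act as classifiers.
Networks of this type are known to be vulnerable to adversarial attacks, in which imperceptible perturbations of the input can change the model's output.
In contrast, here we focus on the effect that small changes within the model can have on the attribution method without changing the model's output.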

Abstract (original)

We discuss a vulnerability involving a category of attribution methods used to provide explanations for the outputs of convolutional neural networks working as classifiers. Networks of this type are known to be vulnerable to adversarial attacks, in which imperceptible perturbations of the input may alter the outputs of the model. In contrast, here we focus on the effects that small modifications to the model may have on the attribution method without altering the model outputs.
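The abstract does not spell out the modification, so the following is only a minimal sketch of one way such an effect can arise, not the authors' construction. It relies on a standard property of softmax: adding the same quantity to every pre-softmax score leaves the output probabilities unchanged. If that quantity depends on the input, the gradient of a pre-softmax score with respect to the input (the basis of many attribution maps) changes anyway. The toy linear classifier `W`, the shift direction `v`, and all function names are assumptions made up for illustration.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of scores.
    m = max(z)
    exps = [math.exp(s - m) for s in z]
    total = sum(exps)
    return [e / total for e in exps]

def logits(W, x):
    # Pre-softmax scores of a toy linear classifier: z = W x.
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]

def shifted_logits(W, v, x):
    # Hypothetical modified model: add the same input-dependent
    # scalar v.x to every pre-softmax score.
    shift = sum(v_i * x_i for v_i, x_i in zip(v, x))
    return [z + shift for z in logits(W, x)]

def grad_of_score(score_fn, x, c, eps=1e-6):
    # Central finite-difference gradient of score c with respect to x.
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += eps
        x_minus = list(x)
        x_minus[i] -= eps
        diff = score_fn(x_plus)[c] - score_fn(x_minus)[c]
        grad.append(diff / (2 * eps))
    return grad

# Illustrative values only (assumptions, not taken from the paper).
W = [[1.0, -2.0, 0.5], [0.3, 0.8, -1.0]]
v = [5.0, 0.0, -5.0]
x = [0.2, -0.1, 0.4]

def original(inp):
    return logits(W, inp)

def modified(inp):
    return shifted_logits(W, v, inp)

probs_original = softmax(original(x))
probs_modified = softmax(modified(x))
grad_original = grad_of_score(original, x, c=0)
grad_modified = grad_of_score(modified, x, c=0)

print("probabilities:", probs_original, probs_modified)
print("gradient of score 0:", grad_original, grad_modified)
```

In this sketch both models return the same class probabilities for every input, yet the gradient of the class-0 score moves from `W[0]` to `W[0] + v`, so a gradient-based map computed from pre-softmax scores would differ between them.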

arXiv information

Authors	Miguel Lerma, Mirtha Lucas
Published	2023-10-25 16:35:19+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
