Unlearnable Examples Detection via Iterative Filtering

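Summary

Deep neural networks have been shown to be vulnerable to data poisoning attacks.
Recently, a specific type of data poisoning attack known as an availability attack adds imperceptible perturbations to images, so that models trained on the data fail to make use of it.
Detecting the poisoned samples, also called Unlearnable Examples (UEs), in a mixed dataset is therefore both highly beneficial and challenging.
In response, we propose an iterative filtering approach for identifying UEs.
The method exploits the distinction between inherent semantic mapping rules and shortcuts, and requires no additional information.
We verify that when a classifier is trained on a mixed dataset containing both UEs and clean data, the model tends to adapt to the UEs more quickly than to the clean data.
Because of the accuracy gap between training on clean samples and training on poisoned samples, we employ a model that misclassifies clean samples while correctly identifying poisoned ones.
Incorporating additional classes and iterative refinement strengthens the model's ability to distinguish clean samples from poisoned ones.
Extensive experiments show that our method outperforms state-of-the-art detection approaches across a variety of attacks, datasets, and poison ratios, significantly reducing the Half Total Error Rate (HTER) relative to existing methods.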
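
The summary reports results as HTER without defining it. As a minimal sketch, assuming the standard definition (the mean of the false acceptance rate and the false rejection rate) and assuming that "poisoned" is treated as the positive class, the metric could be computed as below. The function name and argument names are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def half_total_error_rate(is_poisoned: Sequence[bool], flagged: Sequence[bool]) -> float:
    """Return HTER = (FAR + FRR) / 2 for a poisoned-vs-clean detector.

    is_poisoned: ground truth, True where the sample is an unlearnable example.
    flagged:     detector output, True where the sample was flagged as poisoned.
    Which error is called "acceptance" and which "rejection" is an assumption
    here; HTER averages the two rates, so its value does not depend on it.
    """
    if len(is_poisoned) != len(flagged):
        raise ValueError("inputs must have the same length")
    n_poisoned = sum(bool(p) for p in is_poisoned)
    n_clean = len(is_poisoned) - n_poisoned
    if n_poisoned == 0 or n_clean == 0:
        raise ValueError("need at least one clean and one poisoned sample")

    # Poisoned samples that the detector let through as clean.
    false_accepts = sum(1 for p, f in zip(is_poisoned, flagged) if p and not f)
    # Clean samples that the detector flagged as poisoned.
    false_rejects = sum(1 for p, f in zip(is_poisoned, flagged) if not p and f)

    far = false_accepts / n_poisoned
    frr = false_rejects / n_clean
    return (far + frr) / 2


if __name__ == "__main__":
    truth = [True, True, False, False, False, False]
    preds = [True, False, True, False, False, False]
    print(half_total_error_rate(truth, preds))
```

Because each error rate is normalized by its own class size, HTER stays comparable across different poison ratios, which is presumably why it suits a mixed-dataset setting like the one described above.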

Abstract (original)

Deep neural networks are proven to be vulnerable to data poisoning attacks. Recently, a specific type of data poisoning attack known as availability attacks has led to the failure of data utilization for model learning by adding imperceptible perturbations to images. Consequently, it is quite beneficial and challenging to detect poisoned samples, also known as Unlearnable Examples (UEs), from a mixed dataset. In response, we propose an Iterative Filtering approach for UEs identification. This method leverages the distinction between the inherent semantic mapping rules and shortcuts, without the need for any additional information. We verify that when training a classifier on a mixed dataset containing both UEs and clean data, the model tends to quickly adapt to the UEs compared to the clean data. Due to the accuracy gaps between training with clean/poisoned samples, we employ a model to misclassify clean samples while correctly identifying the poisoned ones. The incorporation of additional classes and iterative refinement enhances the model’s ability to differentiate between clean and poisoned samples. Extensive experiments demonstrate the superiority of our method over state-of-the-art detection approaches across various attacks, datasets, and poison ratios, significantly reducing the Half Total Error Rate (HTER) compared to existing methods.

arXiv information

Authors	Yi Yu, Qichen Zheng, Siyuan Yang, Wenhan Yang, Jun Liu, Shijian Lu, Yap-Peng Tan, Kwok-Yan Lam, Alex Kot
Published	2024-08-15 13:26:13+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
