HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks

Abstract

Although numerous defenses have been proposed against poisoning attacks from untrusted data sources, most defend only against specific attacks, which leaves many avenues for an adversary to exploit. In this work, we propose Healthy Influential-Noise based Training (HINT), an efficient and robust training approach that uses influence functions to defend against data poisoning attacks. Using influence functions, we craft healthy noise that hardens the classification model against poisoning attacks without significantly affecting its generalization ability on test data. In addition, our method remains effective when only a subset of the training data is modified, whereas several previous works add noise to all examples. We conduct comprehensive evaluations on two image datasets with state-of-the-art poisoning attacks under different realistic attack scenarios. Our empirical results show that HINT can efficiently protect deep learning models against both untargeted and targeted poisoning attacks.
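The abstract relies on influence functions without spelling them out, so a small illustration may help. The sketch below is not the authors' implementation and does not craft any noise. It only computes the standard influence estimate, I(z, z_test) = -∇L(z_test)ᵀ H⁻¹ ∇L(z), for a toy two-parameter logistic regression. The data, the L2 penalty `lam`, and all function names are assumptions made for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def grad_loss(w, x, y):
    """Gradient of the logistic loss for one example, with y in {0, 1}."""
    p = sigmoid(w[0] * x[0] + w[1] * x[1])
    return [(p - y) * x[0], (p - y) * x[1]]

def fit(data, lam=0.1, lr=0.5, steps=2000):
    """Gradient descent on the mean logistic loss plus (lam / 2) * ||w||^2."""
    w = [0.0, 0.0]
    n = len(data)
    for _ in range(steps):
        g = [lam * w[0], lam * w[1]]
        for x, y in data:
            gi = grad_loss(w, x, y)
            g[0] += gi[0] / n
            g[1] += gi[1] / n
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w

def hessian(w, data, lam=0.1):
    """Hessian of the regularized training objective (2 x 2)."""
    n = len(data)
    h = [[lam, 0.0], [0.0, lam]]
    for x, _ in data:
        p = sigmoid(w[0] * x[0] + w[1] * x[1])
        s = p * (1.0 - p) / n
        for i in range(2):
            for j in range(2):
                h[i][j] += s * x[i] * x[j]
    return h

def solve2(h, v):
    """Solve the 2 x 2 linear system h @ u = v by Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(h[1][1] * v[0] - h[0][1] * v[1]) / det,
            (h[0][0] * v[1] - h[1][0] * v[0]) / det]

def influence(w, data, z_train, z_test, lam=0.1):
    """Estimated effect on the test loss of upweighting z_train.

    Negative: upweighting z_train lowers the test loss (helpful point).
    Positive: upweighting z_train raises the test loss (harmful point).
    """
    h_inv_g = solve2(hessian(w, data, lam), grad_loss(w, *z_train))
    g_test = grad_loss(w, *z_test)
    return -(g_test[0] * h_inv_g[0] + g_test[1] * h_inv_g[1])

# Toy data (assumed): each input is (feature, bias term fixed at 1.0).
clean_point = ((2.0, 1.0), 1)
poisoned_point = ((1.8, 1.0), 0)   # label flipped, standing in for a poison
data = [clean_point, ((1.5, 1.0), 1), ((-2.0, 1.0), 0),
        ((-1.5, 1.0), 0), poisoned_point]
z_test = ((1.7, 1.0), 1)

w = fit(data)
print("clean point influence:   ", influence(w, data, clean_point, z_test))
print("poisoned point influence:", influence(w, data, poisoned_point, z_test))
```

In this toy setting the correctly labeled point receives a negative score and the label-flipped point a positive one, which is the kind of signal an influence-based defense can act on. How HINT turns such scores into healthy noise, and which training examples it modifies, is not specified in the abstract and is left out here.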

arXiv information

Authors	Minh-Hao Van, Alycia N. Carey, Xintao Wu
Published	2023-09-15 17:12:19+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
