Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks

Summary

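Deep neural networks are vulnerable to backdoor attacks, a type of adversarial attack that poisons the training data in order to manipulate the behavior of any model trained on it.
Clean-label attacks are a stealthier form of backdoor attack: they succeed without changing the labels of the poisoned data.
Early work on clean-label attacks added the trigger to a random subset of the training set, ignoring the fact that samples contribute unequally to the attack's success.
As a result, these attacks need a high poisoning rate and still achieve a low attack success rate.
To alleviate this problem, several sample selection strategies based on supervised learning have been proposed.
However, these methods assume access to the entire labeled training set and require training a model, which is expensive and not always practical.
This work studies a new, more practical (but also more challenging) threat model in which the attacker supplies data for the target class only (as can happen in face recognition systems, for example) and has no knowledge of the victim model or of any other class in the training set.
Within this setting, the authors study different strategies for selectively poisoning a small set of target-class training samples to raise the attack success rate.
Because the attack can be carried out effectively with such limited information, this threat model poses a serious risk whenever machine learning models are trained on third-party datasets.
Experiments on benchmark datasets show that the strategies are effective at improving clean-label backdoor attacks.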
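
To make the setting concrete, here is a minimal sketch of selective clean-label poisoning when only target-class samples are available. It is an illustration under stated assumptions, not the authors' method: the summary does not say how samples are scored, so the rule used here (distance from the class mean in raw input space), the trigger (overwriting the last few entries of a flattened sample), and all function and parameter names are hypothetical. Labels are never touched, which is what makes the poisoning clean-label.

```python
import math


def select_hard_samples(target_samples, budget):
    """Return indices of the `budget` samples farthest from the class mean.

    ASSUMPTION: distance from the class mean is a stand-in scoring rule
    chosen for illustration; the source does not specify one.
    """
    n = len(target_samples)
    dim = len(target_samples[0])
    mean = [sum(s[d] for s in target_samples) / n for d in range(dim)]
    scores = [math.dist(s, mean) for s in target_samples]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


def add_trigger(sample, trigger_value=1.0, trigger_size=2):
    """Return a copy of `sample` with its last `trigger_size` entries overwritten.

    ASSUMPTION: a fixed patch at the end of a flattened sample is a
    hypothetical trigger, used only to keep the sketch small.
    """
    poisoned = list(sample)
    start = len(poisoned) - trigger_size
    poisoned[start:] = [trigger_value] * trigger_size
    return poisoned


def poison_target_class(target_samples, budget):
    """Add the trigger to the selected samples only; labels are left as they are."""
    chosen = set(select_hard_samples(target_samples, budget))
    poisoned = [
        add_trigger(s) if i in chosen else list(s)
        for i, s in enumerate(target_samples)
    ]
    return poisoned, sorted(chosen)


if __name__ == "__main__":
    # Four flattened toy "images" from the target class.
    samples = [
        [0.0, 0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0, 0.0],
        [5.0, 5.0, 0.0, 0.0],
        [0.0, 0.1, 0.0, 0.0],
    ]
    poisoned, chosen = poison_target_class(samples, budget=1)
    print("poisoned indices:", chosen)
```

The random-subset baseline described above corresponds to replacing `select_hard_samples` with a uniform random choice of `budget` indices; the selective variant differs only in how those indices are picked.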

Summary (original)

Deep neural networks are vulnerable to backdoor attacks, a type of adversarial attack that poisons the training data to manipulate the behavior of models trained on such data. Clean-label attacks are a more stealthy form of backdoor attacks that can perform the attack without changing the labels of poisoned data. Early works on clean-label attacks added triggers to a random subset of the training set, ignoring the fact that samples contribute unequally to the attack’s success. This results in high poisoning rates and low attack success rates. To alleviate the problem, several supervised learning-based sample selection strategies have been proposed. However, these methods assume access to the entire labeled training set and require training, which is expensive and may not always be practical. This work studies a new and more practical (but also more challenging) threat model where the attacker only provides data for the target class (e.g., in face recognition systems) and has no knowledge of the victim model or any other classes in the training set. We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate in this setting. Our threat model poses a serious threat in training machine learning models with third-party datasets, since the attack can be performed effectively with limited information. Experiments on benchmark datasets illustrate the effectiveness of our strategies in improving clean-label backdoor attacks.

arXiv information

Authors	Quang H. Nguyen, Nguyen Ngoc-Hieu, The-Anh Ta, Thanh Nguyen-Tang, Kok-Seng Wong, Hoang Thanh-Tung, Khoa D. Doan
Published	2024-07-16 04:21:12+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
