Dispersed Pixel Perturbation-based Imperceptible Backdoor Trigger for Image Classifier Models

Summary

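Typical deep neural network (DNN) backdoor attacks rely on triggers embedded in the inputs.
Existing imperceptible triggers are either computationally expensive to generate or achieve low attack success rates.
This paper proposes a new backdoor trigger that is easy to generate, imperceptible, and highly effective.
The new trigger is a uniformly randomly generated three-dimensional (3D) binary pattern that can be repeated horizontally and/or vertically, mirrored, and superposed onto three-channel images to train a backdoored DNN model.
Dispersed across the whole image, the new trigger perturbs each individual pixel only weakly, yet collectively forms a strong, recognizable pattern for training and activating the DNN's backdoor.
The paper also shows analytically that the trigger becomes increasingly effective as image resolution increases.
Experiments are conducted with the ResNet-18 and MLP models on the MNIST, CIFAR-10, and BTSR datasets.
In terms of imperceptibility, the new trigger outperforms existing triggers such as BadNets, Trojaned NN, and Hidden Backdoor by more than an order of magnitude.
The new trigger achieves an attack success rate of almost 100%, reduces classification accuracy by less than 0.7% to 2.4%, and invalidates state-of-the-art defense techniques.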

Summary (original)

Typical deep neural network (DNN) backdoor attacks are based on triggers embedded in inputs. Existing imperceptible triggers are computationally expensive or low in attack success. In this paper, we propose a new backdoor trigger, which is easy to generate, imperceptible, and highly effective. The new trigger is a uniformly randomly generated three-dimensional (3D) binary pattern that can be horizontally and/or vertically repeated and mirrored and superposed onto three-channel images for training a backdoored DNN model. Dispersed throughout an image, the new trigger produces weak perturbation to individual pixels, but collectively holds a strong recognizable pattern to train and activate the backdoor of the DNN. We also analytically reveal that the trigger is increasingly effective with the improving resolution of the images. Experiments are conducted using the ResNet-18 and MLP models on the MNIST, CIFAR-10, and BTSR datasets. In terms of imperceptibility, the new trigger outperforms existing triggers, such as BadNets, Trojaned NN, and Hidden Backdoor, by over an order of magnitude. The new trigger achieves an almost 100% attack success rate, only reduces the classification accuracy by less than 0.7%-2.4%, and invalidates the state-of-the-art defense techniques.
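
The trigger construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the block size, the order of mirroring and repetition, or how the pattern is superposed, so the function names (`make_trigger`, `tile_trigger`, `apply_trigger`), the 4x4 block, the mirror-then-tile order, and the additive perturbation of one intensity level are all hypothetical choices.

```python
import numpy as np


def make_trigger(block_h, block_w, channels=3, seed=0):
    """Draw a uniformly random binary block of shape (block_h, block_w, channels)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(block_h, block_w, channels), dtype=np.uint8)


def tile_trigger(block, height, width, mirror=True):
    """Repeat (and optionally mirror) the block until it covers height x width."""
    if mirror:
        # Assumption: mirror left-right, then top-bottom, giving a symmetric
        # tile twice as large as the block in each spatial dimension.
        top = np.concatenate([block, block[:, ::-1]], axis=1)
        block = np.concatenate([top, top[::-1]], axis=0)
    reps_h = -(-height // block.shape[0])  # ceiling division
    reps_w = -(-width // block.shape[1])
    tiled = np.tile(block, (reps_h, reps_w, 1))
    return tiled[:height, :width]


def apply_trigger(image, pattern, strength=1):
    """Superpose the pattern on a uint8 image (assumed additive, clipped to [0, 255])."""
    poisoned = image.astype(np.int16) + strength * pattern.astype(np.int16)
    return np.clip(poisoned, 0, 255).astype(np.uint8)


# Usage: a flat grey 32x32 RGB image stands in for a CIFAR-10 sample.
image = np.full((32, 32, 3), 128, dtype=np.uint8)
pattern = tile_trigger(make_trigger(4, 4), 32, 32)
poisoned = apply_trigger(image, pattern)
```

With `strength=1`, no pixel changes by more than one intensity level, while the same binary pattern recurs across the whole image. That is the intuition behind a perturbation that is weak per pixel but strong collectively.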

arXiv information

Authors	Yulong Wang, Minghui Zhao, Shenghong Li, Xin Yuan, Wei Ni
Published	2022-08-19 13:33:12+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
