Adaptive Perturbation for Adversarial Attack

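Abstract

In recent years, the security of deep learning models has attracted growing attention as neural networks, which are vulnerable to adversarial examples, have developed rapidly.
Almost all existing gradient-based attack methods use the sign function when generating perturbations in order to satisfy the perturbation budget under the $L_\infty$ norm.
However, we find that the sign function may be ill-suited to generating adversarial examples because it alters the exact gradient direction.
Instead of the sign function, we propose to use the exact gradient direction directly, together with a scaling factor, to generate adversarial perturbations; this improves the attack success rate of adversarial examples even with smaller perturbations.
We also prove theoretically that this method can achieve better black-box transferability.
Moreover, since the best scaling factor varies from image to image, we propose an adaptive scaling factor generator that finds an appropriate scaling factor for each image, avoiding the computational cost of searching for the scaling factor manually.
Our method can be integrated with almost all existing gradient-based attack methods to further improve their attack success rates.
Extensive experiments on the CIFAR10 and ImageNet datasets show that our method exhibits higher transferability and outperforms state-of-the-art methods.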
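
To make the contrast between the sign function and the exact gradient direction concrete, here is a minimal NumPy sketch of a single attack step in each style. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the update rule, so the L2 normalization of the gradient, the fixed `scale` value, the random stand-in gradient, and the helper names `sign_step`, `scaled_gradient_step`, and `cosine` are all hypothetical. The adaptive scaling factor generator is not modeled here.

```python
import numpy as np


def sign_step(x, grad, eps):
    """FGSM-style step: every pixel moves by exactly eps along sign(grad)."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)


def scaled_gradient_step(x, grad, eps, scale):
    """Step along the exact gradient direction, scaled by `scale`.

    Assumption: the gradient is L2-normalized before scaling, and the
    result is clipped so the perturbation stays inside the L_inf budget.
    """
    direction = grad / (np.linalg.norm(grad) + 1e-12)
    delta = np.clip(scale * direction, -eps, eps)
    return np.clip(x + delta, 0.0, 1.0)


def cosine(a, b):
    """Cosine similarity between two arrays, treated as flat vectors."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Toy data: a random "image" in [0.2, 0.8] and a random stand-in gradient.
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(3, 8, 8))
grad = rng.normal(size=x.shape)
eps = 8 / 255  # a commonly used L_inf budget for images scaled to [0, 1]

x_sign = sign_step(x, grad, eps)
x_scaled = scaled_gradient_step(x, grad, eps, scale=0.1)

# How well does each perturbation align with the true gradient direction?
cos_sign = cosine(x_sign - x, grad)
cos_scaled = cosine(x_scaled - x, grad)

# How large is each perturbation in the L2 sense?
l2_sign = float(np.linalg.norm(x_sign - x))
l2_scaled = float(np.linalg.norm(x_scaled - x))

print(f"cosine to gradient: sign={cos_sign:.3f}, scaled={cos_scaled:.3f}")
print(f"L2 norm of perturbation: sign={l2_sign:.3f}, scaled={l2_scaled:.3f}")
```

In this toy setting the sign step discards the relative magnitudes of the gradient entries, so its perturbation is less aligned with the gradient and has a larger L2 norm than the scaled step, while both stay within the same $L_\infty$ budget. The choice `scale=0.1` is arbitrary; the abstract states that the best value differs per image, which is what motivates the adaptive generator.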

Abstract (original)

In recent years, the security of deep learning models achieves more and more attentions with the rapid development of neural networks, which are vulnerable to adversarial examples. Almost all existing gradient-based attack methods use the sign function in the generation to meet the requirement of perturbation budget on $L_\infty$ norm. However, we find that the sign function may be improper for generating adversarial examples since it modifies the exact gradient direction. Instead of using the sign function, we propose to directly utilize the exact gradient direction with a scaling factor for generating adversarial perturbations, which improves the attack success rates of adversarial examples even with fewer perturbations. At the same time, we also theoretically prove that this method can achieve better black-box transferability. Moreover, considering that the best scaling factor varies across different images, we propose an adaptive scaling factor generator to seek an appropriate scaling factor for each image, which avoids the computational cost for manually searching the scaling factor. Our method can be integrated with almost all existing gradient-based attack methods to further improve their attack success rates. Extensive experiments on the CIFAR10 and ImageNet datasets show that our method exhibits higher transferability and outperforms the state-of-the-art methods.

arXiv information

Authors	Zheng Yuan, Jie Zhang, Zhaoyan Jiang, Liangliang Li, Shiguang Shan
Published	2023-01-02 13:04:00+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
