Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Summary

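Vision Transformers (ViTs) have been widely applied to a variety of computer vision and vision-language tasks.
To gain insight into their robustness in practical scenarios, transferable adversarial examples on ViTs have been studied extensively.
A typical approach to improving adversarial transferability is to refine the surrogate model.
However, existing work on ViTs has limited surrogate refinement to backward propagation.
In this work, we instead focus on Forward Propagation Refinement (FPR) and specifically refine two key modules of ViTs: attention maps and token embeddings.
For attention maps, we propose Attention Map Diversification (AMD), which diversifies certain attention maps and implicitly imposes beneficial gradient vanishing during backward propagation.
For token embeddings, we propose Momentum Token Embedding (MTE), which accumulates historical token embeddings to stabilize the forward updates in both the Attention and MLP blocks.
We conduct extensive experiments with adversarial examples transferred from ViTs to various CNNs and ViTs, showing that FPR outperforms the current best (backward) surrogate refinement by up to 7.0% on average.
We also validate its superiority against popular defenses and its compatibility with other transfer methods.
Code and appendix are available at https://github.com/RYC-98/FPR.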
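
The summary describes MTE only as accumulating historical token embeddings. As a rough illustration of what momentum-style accumulation can look like in general, the sketch below keeps a decayed running sum over a sequence of embedding vectors. The function name `momentum_accumulate`, the `decay` parameter, and the update rule `running = decay * running + current` are assumptions made for illustration; the summary does not give the paper's exact formulation, so this should not be read as the authors' implementation.

```python
def momentum_accumulate(embeddings, decay=0.9):
    """Return a decayed running sum for each step of a sequence of vectors.

    Hypothetical helper: the update rule here is a generic momentum
    accumulation, assumed for illustration, not taken from the paper.
    """
    accumulated = []
    running = None
    for emb in embeddings:
        if running is None:
            # First step: there is no history yet, so start from the vector itself.
            running = list(emb)
        else:
            # Later steps: decay the history and add the current vector.
            running = [decay * r + e for r, e in zip(running, emb)]
        accumulated.append(running)
    return accumulated


if __name__ == "__main__":
    # Two toy "token embeddings" of dimension 2.
    history = momentum_accumulate([[1.0, 2.0], [1.0, 0.0]], decay=0.5)
    print(history)
```

A larger `decay` gives earlier embeddings more weight in the running sum, which is the usual sense in which momentum smooths a sequence of updates.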

Summary (original)

Vision Transformers (ViTs) have been widely applied in various computer vision and vision-language tasks. To gain insights into their robustness in practical scenarios, transferable adversarial examples on ViTs have been extensively studied. A typical approach to improving adversarial transferability is by refining the surrogate model. However, existing work on ViTs has restricted their surrogate refinement to backward propagation. In this work, we instead focus on Forward Propagation Refinement (FPR) and specifically refine two key modules of ViTs: attention maps and token embeddings. For attention maps, we propose Attention Map Diversification (AMD), which diversifies certain attention maps and also implicitly imposes beneficial gradient vanishing during backward propagation. For token embeddings, we propose Momentum Token Embedding (MTE), which accumulates historical token embeddings to stabilize the forward updates in both the Attention and MLP blocks. We conduct extensive experiments with adversarial examples transferred from ViTs to various CNNs and ViTs, demonstrating that our FPR outperforms the current best (backward) surrogate refinement by up to 7.0% on average. We also validate its superiority against popular defenses and its compatibility with other transfer methods. Codes and appendix are available at https://github.com/RYC-98/FPR.

arXiv information

Authors	Yuchen Ren, Zhengyu Zhao, Chenhao Lin, Bo Yang, Lu Zhou, Zhe Liu, Chao Shen
Published	2025-03-19 16:44:23+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
