CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP

Summary

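Although CLIP is widely used in a zero-shot manner for image-text matching tasks, it has been shown to be highly vulnerable to adversarial perturbations added to images.
Recent studies finetune the vision encoder of CLIP on adversarial samples generated on the fly and report improved robustness against adversarial attacks across a range of downstream datasets, a property termed zero-shot robustness.
In this paper, we show that malicious perturbations that seek to maximise the classification loss lead to "falsely stable" images, and we propose leveraging CLIP's pre-trained vision encoder to counterattack such adversarial images during inference to achieve robustness.
Our paradigm is simple and training-free, providing the first method to defend CLIP against adversarial attacks at test time; it is orthogonal to existing methods that aim to boost the zero-shot adversarial robustness of CLIP.
We conduct experiments across 16 classification datasets and demonstrate stable and consistent gains over test-time defence methods adapted from existing adversarial robustness studies that do not rely on external networks, without noticeably impairing performance on clean images.
We also show that our paradigm can be applied to adversarially finetuned CLIP models to further enhance their robustness at test time.
Our code is available at https://github.com/Sxing2/CLIP-Test-time-Counterattacks.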

Summary (original)

Despite its prevalent use in image-text matching tasks in a zero-shot manner, CLIP has been shown to be highly vulnerable to adversarial perturbations added onto images. Recent studies propose to finetune the vision encoder of CLIP with adversarial samples generated on the fly, and show improved robustness against adversarial attacks on a spectrum of downstream datasets, a property termed as zero-shot robustness. In this paper, we show that malicious perturbations that seek to maximise the classification loss lead to 'falsely stable' images, and propose to leverage the pre-trained vision encoder of CLIP to counterattack such adversarial images during inference to achieve robustness. Our paradigm is simple and training-free, providing the first method to defend CLIP from adversarial attacks at test time, which is orthogonal to existing methods aiming to boost zero-shot adversarial robustness of CLIP. We conduct experiments across 16 classification datasets, and demonstrate stable and consistent gains compared to test-time defence methods adapted from existing adversarial robustness studies that do not rely on external networks, without noticeably impairing performance on clean images. We also show that our paradigm can be employed on CLIP models that have been adversarially finetuned to further enhance their robustness at test time. Our code is available at https://github.com/Sxing2/CLIP-Test-time-Counterattacks.

arXiv information

Authors	Songlong Xing, Zhengyu Zhao, Nicu Sebe
Published	2025-03-05 15:51:59+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
