Towards Backdoor Stealthiness in Model Parameter Space

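Summary

Recent research on backdoor stealthiness has focused mainly on indistinguishable triggers in input space and inseparable backdoor representations in feature space, with the aim of evading backdoor defenses that inspect those respective spaces.
However, existing backdoor attacks are typically designed to withstand one specific type of backdoor defense, without accounting for the diversity of defense mechanisms.
Based on this observation, we ask a natural question: are current backdoor attacks truly a real-world threat when they face diverse practical defenses?
To answer it, we examine 12 common backdoor attacks that focus on input-space or feature-space stealthiness, together with 17 diverse representative defenses.
Surprisingly, this reveals a critical blind spot: backdoor attacks designed to be stealthy in input and feature space can be mitigated by examining the backdoored model in parameter space.
To investigate the root cause of this common vulnerability, we study the characteristics of backdoor attacks in parameter space.
In particular, we find that input-space and feature-space attacks introduce prominent backdoor-related neurons in parameter space, and that current backdoor attacks do not thoroughly account for these neurons.
Taking comprehensive stealthiness into account, we propose a new supply-chain attack called Grond.
Grond limits parameter changes through a simple yet effective module, Adversarial Backdoor Injection (ABI), which adaptively increases parameter-space stealthiness during backdoor injection.
Extensive experiments show that Grond outperforms all 12 backdoor attacks against state-of-the-art defenses (including adaptive ones) on CIFAR-10, GTSRB, and a subset of ImageNet.
In addition, we show that ABI consistently improves the effectiveness of common backdoor attacks.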

Abstract (original)

Recent research on backdoor stealthiness focuses mainly on indistinguishable triggers in input space and inseparable backdoor representations in feature space, aiming to circumvent backdoor defenses that examine these respective spaces. However, existing backdoor attacks are typically designed to resist a specific type of backdoor defense without considering the diverse range of defense mechanisms. Based on this observation, we pose a natural question: Are current backdoor attacks truly a real-world threat when facing diverse practical defenses? To answer this question, we examine 12 common backdoor attacks that focus on input-space or feature-space stealthiness and 17 diverse representative defenses. Surprisingly, we reveal a critical blind spot: Backdoor attacks designed to be stealthy in input and feature spaces can be mitigated by examining backdoored models in parameter space. To investigate the underlying causes behind this common vulnerability, we study the characteristics of backdoor attacks in the parameter space. Notably, we find that input- and feature-space attacks introduce prominent backdoor-related neurons in parameter space, which are not thoroughly considered by current backdoor attacks. Taking comprehensive stealthiness into account, we propose a novel supply-chain attack called Grond. Grond limits the parameter changes by a simple yet effective module, Adversarial Backdoor Injection (ABI), which adaptively increases the parameter-space stealthiness during the backdoor injection. Extensive experiments demonstrate that Grond outperforms all 12 backdoor attacks against state-of-the-art (including adaptive) defenses on CIFAR-10, GTSRB, and a subset of ImageNet. In addition, we show that ABI consistently improves the effectiveness of common backdoor attacks.
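
The observation that input- and feature-space attacks leave prominent backdoor-related neurons in parameter space can be made concrete with a toy check. The sketch below is not the paper's analysis, not one of the 17 defenses, and not ABI. It only assumes that "prominent" can be approximated by an unusually large per-neuron weight norm within a layer. The function names `neuron_norms` and `prominent_neurons`, the toy weights, and the 3.5 cutoff (a conventional choice for the modified z-score) are all illustrative assumptions.

```python
import math
import statistics


def neuron_norms(weights):
    """Return the L2 norm of each row; a row holds one neuron's incoming weights."""
    return [math.sqrt(sum(w * w for w in row)) for row in weights]


def prominent_neurons(weights, threshold=3.5):
    """Return indices of neurons whose weight norm is a robust high outlier.

    Assumption for illustration only: a neuron counts as "prominent" when its
    modified z-score, 0.6745 * (norm - median) / MAD, exceeds `threshold`.
    """
    norms = neuron_norms(weights)
    med = statistics.median(norms)
    mad = statistics.median([abs(n - med) for n in norms])
    if mad == 0:
        # All norms (or at least half of them) coincide; no scale to compare against.
        return []
    return [
        i
        for i, n in enumerate(norms)
        if 0.6745 * (n - med) / mad > threshold
    ]


# Hypothetical layer: five ordinary neurons and one with much larger weights.
toy_layer = [
    [0.1, 0.2],
    [0.3, 0.1],
    [0.2, 0.2],
    [0.1, 0.1],
    [0.3, 0.3],
    [4.0, 3.0],
]
print(prominent_neurons(toy_layer))  # [5]
```

Under this simplification, an attack that is stealthy in parameter space would need to keep every neuron's norm close to the bulk of its layer, which is the kind of constraint the abstract attributes to limiting parameter changes.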

arXiv information

Authors	Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas, Stjepan Picek
Published	2025-01-10 12:49:12+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
