Robust Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers


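Deep neural networks (DNNs) can be manipulated to exhibit a specific behavior when exposed to a specific trigger pattern, without affecting their performance on normal samples.
Invisible triggers designed for such backdoor attacks are often vulnerable to visual distortion at inference time. To address this limitation, the authors propose the Visible, Semantic, Sample-Specific, and Compatible trigger (VSSC-trigger), which leverages the Stable Diffusion model.
In this approach, a text trigger is used as a prompt and combined with a benign image.
The combination is processed by a pre-trained Stable Diffusion model, which generates a corresponding semantic object.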


Deep neural networks (DNNs) can be manipulated to exhibit specific behaviors when exposed to specific trigger patterns, without affecting their performance on normal samples. This type of attack is known as a backdoor attack. Recent research has focused on designing invisible triggers for backdoor attacks to ensure visual stealthiness. These triggers have demonstrated strong attack performance even under backdoor defense, which aims to eliminate or suppress the backdoor effect in the model. However, through experimental observations, we have noticed that these carefully designed invisible triggers are often susceptible to visual distortion during inference, such as Gaussian blurring or environmental variations in real-world scenarios. This susceptibility significantly undermines the effectiveness of attacks in practical applications, yet it has not received sufficient attention or been thoroughly investigated. To address this limitation, we propose a novel approach called the Visible, Semantic, Sample-Specific, and Compatible trigger (VSSC-trigger), which leverages the Stable Diffusion model, a recent and powerful image generation method. In this approach, a text trigger is used as a prompt and combined with a benign image. The resulting combination is then processed by a pre-trained Stable Diffusion model, generating a corresponding semantic object. This object is seamlessly integrated with the original image, resulting in a new realistic image, referred to as the poisoned image. Extensive experimental results and analysis validate the effectiveness and robustness of our proposed attack method, even in the presence of visual distortion. We believe that the new trigger proposed in this work, along with the idea behind it, will have significant implications for further advances in this direction.
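
The abstract's observation that invisible triggers are fragile under Gaussian blurring can be illustrated with a small toy sketch. It is not the paper's method or experiment: the two trigger patterns, the image size, and the blur parameters below are all assumptions chosen for illustration, and "retained energy" is only a rough proxy for how much of a trigger survives, not an attack success rate. Because blurring is linear, the blurred poisoned image differs from the blurred clean image by exactly the blurred trigger, so the trigger can be studied on its own.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalised 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma=1.5, radius=4):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma, radius)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def energy(img):
    return sum(v * v for row in img for v in row)

def retained_energy(trigger, sigma=1.5, radius=4):
    """Fraction of the trigger's squared magnitude that survives blurring."""
    return energy(gaussian_blur(trigger, sigma, radius)) / energy(trigger)

SIZE = 32  # assumed toy image size

# Hypothetical "invisible" trigger: low-amplitude, high-frequency checkerboard.
invisible = [[2.0 if (x + y) % 2 == 0 else -2.0 for x in range(SIZE)]
             for y in range(SIZE)]

# Hypothetical "visible" trigger: a solid 12x12 block standing in for an
# object-sized region (a real VSSC trigger is a generated semantic object).
visible = [[128.0 if 10 <= x < 22 and 10 <= y < 22 else 0.0
            for x in range(SIZE)]
           for y in range(SIZE)]

invisible_kept = retained_energy(invisible)
visible_kept = retained_energy(visible)
print(f"invisible trigger energy kept: {invisible_kept:.4f}")
print(f"visible trigger energy kept:   {visible_kept:.4f}")
```

In this toy setting the blur removes almost all of the fine-grained perturbation while most of the object-sized pattern survives, which matches the intuition behind preferring visible, semantic triggers. Real invisible triggers and real distortions are more varied than this sketch suggests.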


Authors: Ruotong Wang, Hongrui Chen, Zihao Zhu, Li Liu, Yong Zhang, Yanbo Fan, Baoyuan Wu
Published: 2023-06-01 15:42:06+00:00

Categories: cs.CR, cs.CV