Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization

Summary

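Title: Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization
Summary:
– Defense against backdoor attacks is becoming increasingly critical for the security and integrity of machine learning.
– Fine-tuning on benign data is a natural defense for erasing the backdoor effect, but recent studies show that vanilla fine-tuning defends poorly when only limited benign data is available.
– We study the fine-tuning of backdoored models in depth from the neuron perspective and find that backdoor-related neurons fail to escape the local minimum during fine-tuning.
– Inspired by the observation that backdoor-related neurons often have larger norms, we propose FTSAM, a new backdoor defense paradigm that combines fine-tuning with sharpness-aware minimization to shrink the norms of those neurons.
– We demonstrate the effectiveness of our method on several benchmark datasets and network architectures, where it achieves state-of-the-art defense performance.
– Overall, our work offers a promising approach to improving the robustness of machine learning models against backdoor attacks.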

Summary (original)

Backdoor defense, which aims to detect or mitigate the effect of malicious triggers introduced by attackers, is becoming increasingly critical for machine learning security and integrity. Fine-tuning based on benign data is a natural defense to erase the backdoor effect in a backdoored model. However, recent studies show that, given limited benign data, vanilla fine-tuning has poor defense performance. In this work, we provide a deep study of fine-tuning the backdoored model from the neuron perspective and find that backdoor-related neurons fail to escape the local minimum in the fine-tuning process. Inspired by observing that the backdoor-related neurons often have larger norms, we propose FTSAM, a novel backdoor defense paradigm that aims to shrink the norms of backdoor-related neurons by incorporating sharpness-aware minimization with fine-tuning. We demonstrate the effectiveness of our method on several benchmark datasets and network architectures, where it achieves state-of-the-art defense performance. Overall, our work provides a promising avenue for improving the robustness of machine learning models against backdoor attacks.
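
The abstract does not spell out the update rule, so what follows is only a minimal sketch of a generic sharpness-aware minimization step, not the authors' FTSAM procedure: the weights are first perturbed by a radius `rho` along the normalized gradient, and the descent step then uses the gradient measured at that perturbed point. The toy quadratic loss and the names `sam_step`, `toy_grad`, `CURVATURE`, `lr`, and `rho` are assumptions made for illustration. In a fine-tuning setting, the gradient function would presumably be the loss gradient on the limited benign data.

```python
import math


def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One generic SAM update (illustrative, not the paper's FTSAM variant)."""
    # Gradient at the current weights.
    grad = grad_fn(weights)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    if grad_norm == 0.0:
        # Stationary point: no ascent direction exists, so leave weights as-is.
        return list(weights)
    # Ascent step: move a distance rho along the normalized gradient direction.
    perturbed = [w + rho * g / grad_norm for w, g in zip(weights, grad)]
    # Descent step: update the ORIGINAL weights with the perturbed-point gradient.
    sam_grad = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, sam_grad)]


# Hypothetical toy loss: L(w) = 0.5 * sum(c_i * w_i^2), so dL/dw_i = c_i * w_i.
CURVATURE = [1.0, 10.0]


def toy_grad(weights):
    return [c * w for c, w in zip(CURVATURE, weights)]


weights = [1.0, 1.0]
for _ in range(50):
    weights = sam_step(weights, toy_grad, lr=0.05, rho=0.05)
final_norm = math.sqrt(sum(w * w for w in weights))
```

On this toy problem the perturbed-point gradient is larger along the high-curvature coordinate, which is the general sense in which SAM penalizes sharp directions. Whether and how that translates into shrinking the norms of backdoor-related neurons is the paper's claim and is not demonstrated by this sketch.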

arXiv information

Authors	Mingli Zhu, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu
Published	2023-04-24 05:13:52+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
