Investigating Catastrophic Overfitting in Fast Adversarial Training: A Self-fitting Perspective

Abstract

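Fast adversarial training offers an efficient way to build robust networks, but it can suffer from a serious problem known as catastrophic overfitting (CO), in which multi-step robust accuracy suddenly collapses to zero.
In this paper, we decouple single-step adversarial examples into data-information and self-information for the first time, which reveals an interesting phenomenon called "self-fitting".
Self-fitting, in which the network learns the self-information embedded in single-step perturbations, naturally leads to CO.
When self-fitting occurs, the network shows a clear "channel differentiation" phenomenon: some convolution channels responsible for recognizing self-information become dominant, while the others, which handle data-information, are suppressed.
As a result, the network can only recognize images that carry sufficient self-information and loses its ability to generalize to other types of data.
Building on self-fitting, we provide new insights into existing methods for mitigating CO and extend CO to multi-step adversarial training.
Our findings reveal a self-learning mechanism in adversarial training and open up new perspectives for mitigating CO by suppressing different kinds of information.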
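
For readers unfamiliar with the term, a single-step adversarial example can be sketched as follows. The abstract does not say which single-step attack is used; this sketch assumes FGSM, the attack commonly used in fast adversarial training, and applies it to a toy logistic-regression model whose weights and inputs are made up for illustration. It does not reproduce the paper's decoupling into data-information and self-information.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Binary cross-entropy of a logistic model for one sample, y in {0, 1}."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x: (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One signed-gradient step of size eps (FGSM), without input clipping."""
    grad = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


# Hypothetical toy model and sample, chosen only for illustration.
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 0.2, -0.3], 1
eps = 0.1

x_adv = fgsm(w, b, x, y, eps)
clean_loss = loss(w, b, x, y)
adv_loss = loss(w, b, x_adv, y)
```

Here the perturbation `x_adv - x` is computed from the model's own gradient in a single step and stays within an L-infinity ball of radius `eps`. In fast adversarial training, examples of this kind are generated from the current network at every training step.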

Abstract (original)

Although fast adversarial training provides an efficient approach for building robust networks, it may suffer from a serious problem known as catastrophic overfitting (CO), where multi-step robust accuracy suddenly collapses to zero. In this paper, we for the first time decouple single-step adversarial examples into data-information and self-information, which reveals an interesting phenomenon called ‘self-fitting’. Self-fitting, i.e., the network learns the self-information embedded in single-step perturbations, naturally leads to the occurrence of CO. When self-fitting occurs, the network experiences an obvious ‘channel differentiation’ phenomenon that some convolution channels accounting for recognizing self-information become dominant, while others for data-information are suppressed. In this way, the network can only recognize images with sufficient self-information and loses generalization ability to other types of data. Based on self-fitting, we provide new insights into the existing methods to mitigate CO and extend CO to multi-step adversarial training. Our findings reveal a self-learning mechanism in adversarial training and open up new perspectives for suppressing different kinds of information to mitigate CO.

arXiv information

Authors	Zhengbao He, Tao Li, Sizhe Chen, Xiaolin Huang
Published	2023-03-24 13:40:27+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
