Domain Invariant Adversarial Learning

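Summary

The phenomenon of adversarial examples illustrates one of the most fundamental vulnerabilities of deep neural networks.
Among the various techniques introduced to overcome this inherent weakness, adversarial training has emerged as the most effective strategy for learning robust models.
Typically, this is achieved by balancing robust and natural objectives.
In this work, we aim to further optimize the trade-off between robust accuracy and standard accuracy by enforcing a domain-invariant feature representation.
We present Domain Invariant Adversarial Learning (DIAL), a new adversarial training method that learns a feature representation that is both robust and domain invariant.
DIAL uses a variant of the Domain Adversarial Neural Network (DANN) on the natural domain and its corresponding adversarial domain.
When the source domain consists of natural examples and the target domain consists of adversarially perturbed examples, our method learns a feature representation constrained not to discriminate between natural and adversarial examples, and can therefore achieve a more robust representation.
DIAL is a generic, modular technique that can be easily incorporated into any adversarial training method.
Our experiments indicate that incorporating DIAL into the adversarial training process improves both robustness and standard accuracy.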

Abstract (original)

The phenomenon of adversarial examples illustrates one of the most basic vulnerabilities of deep neural networks. Among the variety of techniques introduced to surmount this inherent weakness, adversarial training has emerged as the most effective strategy for learning robust models. Typically, this is achieved by balancing robust and natural objectives. In this work, we aim to further optimize the trade-off between robust and standard accuracy by enforcing a domain-invariant feature representation. We present a new adversarial training method, Domain Invariant Adversarial Learning (DIAL), which learns a feature representation that is both robust and domain invariant. DIAL uses a variant of Domain Adversarial Neural Network (DANN) on the natural domain and its corresponding adversarial domain. In the case where the source domain consists of natural examples and the target domain is the adversarially perturbed examples, our method learns a feature representation constrained not to discriminate between the natural and adversarial examples, and can therefore achieve a more robust representation. DIAL is a generic and modular technique that can be easily incorporated into any adversarial training method. Our experiments indicate that incorporating DIAL in the adversarial training process improves both robustness and standard accuracy.
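The abstract does not spell out the training objective, so the following is only a minimal sketch of the gradient reversal idea that DANN-style methods rely on, applied to a natural versus adversarial domain label. It is not the authors' implementation. The linear feature extractor, the logistic domain classifier, the name `domain_loss_and_grads`, the reversal strength `lam`, and the fixed stand-in perturbation are all assumptions made for illustration. The classification and robust losses that a full adversarial training method would add are omitted.

```python
import math


def sigmoid(t):
    """Logistic function (adequate for the small toy values used here)."""
    return 1.0 / (1.0 + math.exp(-t))


def domain_loss_and_grads(w, v, b, x, domain_label, lam):
    """One example through feature extractor -> gradient reversal -> domain classifier.

    w            : weights of a hypothetical linear feature extractor, z = w . x
    v, b         : weight and bias of a logistic domain classifier on z
    domain_label : 0 for a natural example, 1 for an adversarial example
    lam          : reversal strength (hypothetical hyperparameter)
    """
    z = sum(wi * xi for wi, xi in zip(w, x))      # feature representation
    p = sigmoid(v * z + b)                        # predicted P(adversarial)
    eps = 1e-12                                   # guards log(0)
    loss = -(domain_label * math.log(p + eps)
             + (1 - domain_label) * math.log(1 - p + eps))

    dlogit = p - domain_label                     # d loss / d logit for sigmoid + BCE
    grad_v = dlogit * z                           # classifier gets the ordinary gradient
    grad_b = dlogit
    grad_z = dlogit * v
    # Gradient reversal: identity in the forward pass, but the gradient that
    # reaches the feature extractor is scaled by -lam. A descent step on w
    # therefore increases the domain loss, pushing the features toward being
    # indistinguishable across the two domains.
    grad_w = [-lam * grad_z * xi for xi in x]
    return loss, grad_w, grad_v, grad_b


# Toy data: a natural input and a stand-in "adversarial" copy with a fixed
# perturbation. A real pipeline would generate it with an attack instead.
x_nat = [1.0, -2.0]
x_adv = [1.1, -1.9]
w, v, b, lam = [0.5, 0.1], 1.5, 0.0, 0.1

loss_nat, gw_nat, gv_nat, gb_nat = domain_loss_and_grads(w, v, b, x_nat, 0, lam)
loss_adv, gw_adv, gv_adv, gb_adv = domain_loss_and_grads(w, v, b, x_adv, 1, lam)
```

In this sketch the domain classifier would be updated with `grad_v` and `grad_b` as usual, while the feature extractor would be updated with the reversed `grad_w`, so the two play against each other. In a framework with automatic differentiation the same effect is usually obtained with a custom layer whose backward pass negates and scales the incoming gradient.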

arXiv information

Authors	Matan Levi, Idan Attias, Aryeh Kontorovich
Published	2022-09-13 10:03:47+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
