Towards Fairness-Aware Adversarial Learning

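Summary

Adversarial training (AT) has proven effective at improving model robustness, but the recently identified problem of robust fairness remains largely unaddressed: robust accuracy varies significantly across classes.
Rather than evaluating only the model's average performance across classes, this paper examines robust fairness by considering the worst-case distribution over classes.
We propose a new learning paradigm called Fairness-Aware Adversarial Learning (FAAL).
As a generalization of conventional AT, FAAL reformulates adversarial training as a min-max-max framework so that the trained model is both robust and fair.
Specifically, the method uses distributionally robust optimization to find the worst-case distribution over classes, and the solution is guaranteed to attain the upper-bound performance with high probability.
Notably, FAAL can fine-tune an unfair robust model into a fair one within only two epochs, without compromising overall clean or robust accuracy.
Extensive experiments on several image datasets validate the superior performance and efficiency of FAAL compared with other state-of-the-art methods.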
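
The min-max-max structure described above can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: $f_\theta$ is the model, $\ell$ the loss, $\delta$ a perturbation with norm budget $\epsilon$, $q$ a distribution over the $C$ classes, $u$ the uniform class distribution, and $\mathcal{Q}$ an uncertainty set of radius $\eta$ under some divergence $D$.

```latex
\min_{\theta} \;
\max_{q \in \mathcal{Q}} \;
\sum_{c=1}^{C} q_c \,
\mathbb{E}_{(x, y) \,\mid\, y = c}
\left[
  \max_{\|\delta\|_p \le \epsilon}
  \ell\bigl(f_\theta(x + \delta),\, y\bigr)
\right],
\qquad
\mathcal{Q} = \bigl\{\, q \in \Delta_C : D(q \,\|\, u) \le \eta \,\bigr\}
```

Under this reading, the inner maximization is the usual adversarial attack, the middle maximization picks the worst-case class distribution, and the outer minimization trains the model. Setting $\eta = 0$ forces $q = u$ and recovers a class-balanced form of conventional AT, which is consistent with FAAL being described as a generalization of it.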

Abstract (original)

Although adversarial training (AT) has proven effective in enhancing the model’s robustness, the recently revealed issue of fairness in robustness has not been well addressed, i.e. the robust accuracy varies significantly among different categories. In this paper, instead of uniformly evaluating the model’s average class performance, we delve into the issue of robust fairness, by considering the worst-case distribution across various classes. We propose a novel learning paradigm, named Fairness-Aware Adversarial Learning (FAAL). As a generalization of conventional AT, we re-define the problem of adversarial training as a min-max-max framework, to ensure both robustness and fairness of the trained model. Specifically, by taking advantage of distributional robust optimization, our method aims to find the worst distribution among different categories, and the solution is guaranteed to obtain the upper bound performance with high probability. In particular, FAAL can fine-tune an unfair robust model to be fair within only two epochs, without compromising the overall clean and robust accuracies. Extensive experiments on various image datasets validate the superior performance and efficiency of the proposed FAAL compared to other state-of-the-art methods.
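
To make the idea of a "worst distribution among different categories" concrete, here is a minimal sketch in Python. It is not FAAL's actual procedure: it assumes a KL-penalized variant of distributionally robust optimization, for which maximizing `sum(q[c] * loss[c]) - tau * KL(q || uniform)` over class distributions `q` has the closed-form solution `q[c]` proportional to `exp(loss[c] / tau)`. The function names, the temperature `tau`, and the example loss values are all hypothetical.

```python
import math


def worst_case_class_weights(class_losses, tau=1.0):
    """Return the class distribution q maximizing
    sum_c q[c] * loss[c] - tau * KL(q || uniform).

    The maximizer is a softmax of the per-class losses at temperature tau.
    This is an illustrative assumption, not the formulation used in the paper.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Subtract the max before exponentiating for numerical stability.
    peak = max(class_losses)
    exps = [math.exp((loss - peak) / tau) for loss in class_losses]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_loss(class_losses, tau=1.0):
    """Class-weighted loss under the worst-case distribution."""
    weights = worst_case_class_weights(class_losses, tau)
    return sum(w * loss for w, loss in zip(weights, class_losses))


# Hypothetical per-class adversarial losses for a 4-class problem.
losses = [0.9, 1.4, 2.3, 1.1]
weights = worst_case_class_weights(losses, tau=0.5)
```

A small `tau` concentrates the weight on the worst-performing class, while a large `tau` approaches uniform weighting, that is, the ordinary class-averaged loss. In a training loop, the reweighted loss would replace the plain average of per-class adversarial losses before the gradient step.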

arXiv information

Authors	Yanghao Zhang, Tianle Zhang, Ronghui Mu, Xiaowei Huang, Wenjie Ruan
Published	2024-02-27 18:01:59+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
