Does Differential Privacy Impact Bias in Pretrained NLP Models?

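Summary

Differential privacy (DP) is applied when fine-tuning pretrained large language models (LLMs) in order to limit leakage of training examples.
Most DP research focuses on improving the privacy-utility trade-off of models, but some researchers have found that DP can be unfair to, or biased against, underrepresented groups.
This study demonstrates the effect of DP on bias in LLMs through empirical analysis.
Differentially private training can increase model bias against protected groups as measured by AUC-based bias metrics.
DP makes it harder for the model to distinguish positive from negative examples drawn from the protected groups and from other groups in the rest of the population.
The results also show that the effect of DP on bias depends not only on the level of privacy protection but also on the underlying distribution of the dataset.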
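
The summary refers to AUC-based bias metrics without naming them. As a rough illustration only, the sketch below computes one commonly used family of such metrics (subgroup AUC, BPSN AUC, and BNSP AUC); whether the paper uses exactly these is an assumption, and the function names `auc` and `bias_aucs` as well as the toy data are hypothetical. A lower value on any of the three means the model has more trouble separating positives from negatives for that slice of the data.

```python
from typing import Dict, List, Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def bias_aucs(scores: Sequence[float],
              labels: Sequence[int],
              in_group: Sequence[bool]) -> Dict[str, float]:
    """Assumed metric family: subgroup, BPSN, and BNSP AUC for one protected group."""

    def pick(label: int, group: bool) -> List[float]:
        # Scores of examples with the given label and group membership.
        return [s for s, y, g in zip(scores, labels, in_group)
                if y == label and g == group]

    return {
        # Positives vs. negatives inside the protected group only.
        "subgroup": auc(pick(1, True), pick(0, True)),
        # Background positives vs. subgroup negatives.
        "bpsn": auc(pick(1, False), pick(0, True)),
        # Subgroup positives vs. background negatives.
        "bnsp": auc(pick(1, True), pick(0, False)),
    }


# Toy data (hypothetical): first four examples belong to the protected group.
demo_scores = [0.9, 0.4, 0.6, 0.2, 0.8, 0.7, 0.3, 0.1]
demo_labels = [1, 1, 0, 0, 1, 1, 0, 0]
demo_in_group = [True, True, True, True, False, False, False, False]
demo_result = bias_aucs(demo_scores, demo_labels, demo_in_group)
print(demo_result)
```

Comparing these values for a model fine-tuned with and without DP, on the same evaluation set, is one way to see the kind of gap the summary describes.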

Abstract (original)

Differential privacy (DP) is applied when fine-tuning pre-trained large language models (LLMs) to limit leakage of training examples. While most DP research has focused on improving a model’s privacy-utility tradeoff, some find that DP can be unfair to or biased against underrepresented groups. In this work, we show the impact of DP on bias in LLMs through empirical analysis. Differentially private training can increase the model bias against protected groups w.r.t AUC-based bias metrics. DP makes it more difficult for the model to differentiate between the positive and negative examples from the protected groups and other groups in the rest of the population. Our results also show that the impact of DP on bias is not only affected by the privacy protection level but also the underlying distribution of the dataset.

arXiv information

Authors	Md. Khairul Islam, Andrew Wang, Tianhao Wang, Yangfeng Ji, Judy Fox, Jieyu Zhao
Published	2024-10-24 13:59:03+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
