Non-discrimination Criteria for Generative Language Models

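Summary

Generative AI, such as large language models, has developed rapidly in recent years.
As these models become available to the general public, concerns arise that applications built on them will perpetuate and amplify harmful biases.
Gender stereotypes, whether they take the form of misrepresentation or discrimination, can be harmful and limiting for the individuals they target.
Recognizing gender bias as a pervasive societal construct, this paper studies how to uncover and quantify gender bias in generative language models.
In particular, we derive generative AI analogues of three well-known non-discrimination criteria from classification: independence, separation, and sufficiency.
To demonstrate these criteria in practice, we design prompts for each criterion that focus on occupational gender stereotypes, using a medical test to introduce ground truth in the generative AI context.
Our results address the presence of occupational gender bias in such conversational language models.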

Abstract (original)

Generative AI, such as large language models, has undergone rapid development within recent years. As these models become increasingly available to the public, concerns arise about perpetuating and amplifying harmful biases in applications. Gender stereotypes can be harmful and limiting for the individuals they target, whether they consist of misrepresentation or discrimination. Recognizing gender bias as a pervasive societal construct, this paper studies how to uncover and quantify the presence of gender biases in generative language models. In particular, we derive generative AI analogues of three well-known non-discrimination criteria from classification, namely independence, separation and sufficiency. To demonstrate these criteria in action, we design prompts for each of the criteria with a focus on occupational gender stereotype, specifically utilizing the medical test to introduce the ground truth in the generative AI context. Our results address the presence of occupational gender bias within such conversational language models.
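
For readers unfamiliar with the three criteria named in the abstract, the following is a minimal sketch of how they are usually checked in the classification setting they come from. It does not reproduce the paper's generative AI analogues or its prompts. The record fields (`group`, `label`, `pred`), the helper names, and the toy data are all hypothetical and made up purely for illustration.

```python
from collections import defaultdict


def group_rates(records, value_key, condition=None):
    """Mean of r[value_key] per group, over records that satisfy `condition`."""
    buckets = defaultdict(list)
    for r in records:
        if condition is None or condition(r):
            buckets[r["group"]].append(r[value_key])
    return {g: sum(v) / len(v) for g, v in buckets.items()}


def criteria_report(records):
    """Per-group rates behind the three classical non-discrimination criteria."""
    return {
        # Independence: P(pred = 1 | group) should match across groups.
        "independence": group_rates(records, "pred"),
        # Separation: P(pred = 1 | label = y, group) should match for each y.
        "separation": {
            y: group_rates(records, "pred", lambda r, y=y: r["label"] == y)
            for y in (0, 1)
        },
        # Sufficiency: P(label = 1 | pred = p, group) should match for each p.
        "sufficiency": {
            p: group_rates(records, "label", lambda r, p=p: r["pred"] == p)
            for p in (0, 1)
        },
    }


# Hypothetical toy data: (group, true label, predicted label). Not from the paper.
rows = [
    ("a", 1, 1), ("a", 1, 1), ("a", 0, 0), ("a", 0, 0),
    ("b", 1, 1), ("b", 1, 0), ("b", 0, 0), ("b", 0, 0),
]
records = [{"group": g, "label": y, "pred": p} for g, y, p in rows]

report = criteria_report(records)
for name, rates in report.items():
    print(name, rates)
```

In this toy data the two groups receive positive predictions at different rates (independence is violated) and have different true positive rates (separation is violated), while the label rate among positive predictions is the same in both groups. A gap between groups in any of the reported rates signals a violation of the corresponding criterion.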

arXiv information

Authors	Sara Sterlie, Nina Weng, Aasa Feragen
Published	2024-08-26 09:35:39+00:00
arXiv page	arxiv_id(pdf)
