Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels

Abstract

Discriminatory gender biases have been found in Pre-trained Language Models (PLMs) for multiple languages. In Natural Language Inference (NLI), existing bias evaluation methods have focused on the prediction results for one specific label out of the three, such as neutral. However, such evaluation methods can be inaccurate, since unique biased inferences are associated with unique prediction labels. To address this limitation, we propose a bias evaluation method for PLMs that considers all three labels of the NLI task. We create three evaluation data groups that represent different types of biases. Then, we define a bias measure based on the corresponding label output of each data group. In the experiments, we introduce a meta-evaluation technique for NLI bias measures and use it to confirm that our bias measure can distinguish biased, incorrect inferences from non-biased, incorrect inferences better than the baseline, resulting in a more accurate bias evaluation. Because we create the datasets in English, Japanese, and Chinese, we also validate the compatibility of our bias measure across multiple languages. Lastly, we observe the bias tendencies in PLMs of each language. To our knowledge, we are the first to construct evaluation datasets and measure PLMs’ bias from NLI in Japanese and Chinese.
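
The idea of scoring bias from the label output associated with each data group can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything below is an assumption made for illustration only: the group names (`group_a`, `group_b`, `group_c`), the label assigned to each group in `BIAS_LABEL`, and the choice to average the per-group rates are all hypothetical and are not the authors' definition.

```python
from collections import Counter

# The three NLI labels that the measure takes into account.
LABELS = ("entailment", "contradiction", "neutral")

# Hypothetical placeholder mapping: for each evaluation data group, the label
# whose frequency is counted for that group. The real groups and their
# corresponding labels are defined in the paper, not here.
BIAS_LABEL = {
    "group_a": "entailment",
    "group_b": "contradiction",
    "group_c": "neutral",
}


def label_rate(predictions, label):
    """Fraction of predictions equal to `label` (0.0 for an empty list)."""
    if not predictions:
        return 0.0
    return Counter(predictions)[label] / len(predictions)


def bias_score(preds_by_group, bias_label=BIAS_LABEL):
    """Average, over the groups, of each group's corresponding-label rate.

    Averaging is an assumption of this sketch, not the paper's formula.
    """
    rates = [label_rate(preds_by_group[g], bias_label[g]) for g in bias_label]
    return sum(rates) / len(rates)
```

A call such as `bias_score({"group_a": [...], "group_b": [...], "group_c": [...]})` takes one list of predicted labels per group. The point of the sketch is only that each group contributes through its own label, rather than every group being scored on a single label such as neutral.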

arXiv information

Authors	Panatchakorn Anantaprayoon, Masahiro Kaneko, Naoaki Okazaki
Published	2024-02-21 16:54:35+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google