The GUS Framework: Benchmarking Social Bias Classification with Discriminative (Encoder-Only) and Generative (Decoder-Only) Language Models

Summary

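Detecting social bias in text is a critical challenge, in large part because binary classification methods are limited.
These methods often oversimplify nuanced biases, and misclassifying content as either "biased" or "fair" can carry a high emotional impact.
To address these shortcomings, the authors propose a more nuanced framework built on three key linguistic components underlying social bias: Generalizations, Unfairness, and Stereotypes (the GUS framework).
The GUS framework uses a semi-automated approach to create a comprehensive synthetic dataset, which is then verified by humans to maintain ethical standards.
This dataset enables robust multi-label token classification.
The methodology, which combines discriminative (encoder-only) models and generative (auto-regressive large language) models, identifies biased entities in text.
Through extensive experiments, the authors show that encoder-only models are effective for this complex task and often outperform state-of-the-art methods in terms of macro and entity-wise F1-score as well as Hamming loss.
These findings can guide model choice for different use cases, highlight the GUS framework's effectiveness in capturing explicit and implicit biases across diverse contexts, and offer a pathway for future research and applications in various fields.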
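To make the evaluation metrics named above concrete, here is a minimal sketch of how Hamming loss and macro F1 can be computed for multi-label token classification. It is not the authors' evaluation code: the label names (`GEN`, `UNFAIR`, `STEREO`), the one-binary-vector-per-token encoding, and the toy sentence and predictions are all assumptions made for illustration.

```python
from typing import List

# Hypothetical label order; the paper's actual tag scheme may differ.
LABELS = ["GEN", "UNFAIR", "STEREO"]

# Toy data (invented): one row per token, one 0/1 column per label.
TOKENS = ["All", "teenagers", "are", "lazy"]
Y_TRUE = [[1, 0, 1], [1, 0, 1], [0, 0, 1], [0, 1, 1]]
Y_PRED = [[1, 0, 1], [1, 0, 0], [0, 0, 1], [0, 0, 1]]


def hamming_loss(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of (token, label) cells where prediction and truth disagree."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


def label_f1(y_true: List[List[int]], y_pred: List[List[int]], k: int) -> float:
    """F1 for label column k; defined as 0.0 when the label never occurs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        t, p = true_row[k], pred_row[k]
        tp += int(t == 1 and p == 1)
        fp += int(t == 0 and p == 1)
        fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(y_true[0])
    return sum(label_f1(y_true, y_pred, k) for k in range(n_labels)) / n_labels


if __name__ == "__main__":
    print("Hamming loss:", round(hamming_loss(Y_TRUE, Y_PRED), 4))
    for k, name in enumerate(LABELS):
        print(name, "F1:", round(label_f1(Y_TRUE, Y_PRED, k), 4))
    print("Macro F1:", round(macro_f1(Y_TRUE, Y_PRED), 4))
```

Lower Hamming loss is better, while higher F1 is better. Because macro F1 averages the labels without weighting, a rarely predicted label such as `UNFAIR` in this toy example pulls the score down as much as a frequent one would.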

Abstract (original)

The detection of social bias in text is a critical challenge, particularly due to the limitations of binary classification methods. These methods often oversimplify nuanced biases, leading to high emotional impact when content is misclassified as either ‘biased’ or ‘fair.’ To address these shortcomings, we propose a more nuanced framework that focuses on three key linguistic components underlying social bias: Generalizations, Unfairness, and Stereotypes (the GUS framework). The GUS framework employs a semi-automated approach to create a comprehensive synthetic dataset, which is then verified by humans to maintain ethical standards. This dataset enables robust multi-label token classification. Our methodology, which combines discriminative (encoder-only) models and generative (auto-regressive large language models), identifies biased entities in text. Through extensive experiments, we demonstrate that encoder-only models are effective for this complex task, often outperforming state-of-the-art methods, both in terms of macro and entity-wise F1-score and Hamming loss. These findings can guide the choice of model for different use cases, highlighting the GUS framework’s effectiveness in capturing explicit and implicit biases across diverse contexts, and offering a pathway for future research and applications in various fields.

arXiv information

Authors	Maximus Powers, Shaina Raza, Alex Chang, Umang Mavani, Harshitha Reddy Jonala, Ansh Tiwari, Hua Wei
Published	2025-02-28 18:55:08+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
