It’s All Relative: Interpretable Models for Scoring Bias in Documents

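Abstract

We propose an interpretable model that scores the bias present in web documents based only on their textual content.
The model incorporates assumptions reminiscent of the Bradley-Terry axioms and is trained on pairs of revisions of the same Wikipedia article in which one version is more biased than the other.
Prior approaches based on absolute bias classification have struggled to reach high accuracy on the task, but by learning to perform pairwise comparisons of bias accurately, we are able to develop a useful model for scoring bias.
We show that the parameters of the trained model can be interpreted to discover the words most indicative of bias.
We also apply the model in three different settings: studying the temporal evolution of bias in Wikipedia articles, comparing news sources based on bias, and scoring bias in law amendments.
In each case, we show that the model's outputs can be explained and validated, even for the two domains that lie outside the training-data domain.
We also use the model to compare the general level of bias across domains, and find that legal texts are the least biased and news media the most biased, with Wikipedia articles in between.
Given its high performance, simplicity, interpretability, and wide applicability, we hope the model will be useful to a large community, including Wikipedia and news editors, political and social scientists, and the general public.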

Abstract (original)

We propose an interpretable model to score the bias present in web documents, based only on their textual content. Our model incorporates assumptions reminiscent of the Bradley-Terry axioms and is trained on pairs of revisions of the same Wikipedia article, where one version is more biased than the other. While prior approaches based on absolute bias classification have struggled to obtain a high accuracy for the task, we are able to develop a useful model for scoring bias by learning to perform pairwise comparisons of bias accurately. We show that we can interpret the parameters of the trained model to discover the words most indicative of bias. We also apply our model in three different settings – studying the temporal evolution of bias in Wikipedia articles, comparing news sources based on bias, and scoring bias in law amendments. In each case, we demonstrate that the outputs of the model can be explained and validated, even for the two domains that are outside the training-data domain. We also use the model to compare the general level of bias between domains, where we see that legal texts are the least biased and news media are the most biased, with Wikipedia articles in between. Given its high performance, simplicity, interpretability, and wide applicability, we hope the model will be useful for a large community, including Wikipedia and news editors, political and social scientists, and the general public.
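
To make the pairwise idea concrete, here is a minimal sketch of a Bradley-Terry-style comparison model. It is not the authors' implementation: the linear bag-of-words score, the whitespace tokenizer, the hyperparameters, the function names, and the toy revision pairs are all assumptions made for illustration. Each text gets a score equal to the weighted sum of its word counts, and the probability that one text is more biased than another is the logistic function of the score difference.

```python
import math
from collections import Counter


def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def score(weights, text):
    """Linear bias score: sum of per-word weights times word counts."""
    return sum(weights.get(w, 0.0) * c for w, c in featurize(text).items())


def prob_more_biased(weights, a, b):
    """Bradley-Terry-style probability that text a is more biased than b."""
    return 1.0 / (1.0 + math.exp(-(score(weights, a) - score(weights, b))))


def train(pairs, epochs=200, lr=0.1, l2=0.01):
    """Fit word weights by gradient ascent on the pairwise log-likelihood.

    Each pair is (more_biased_text, less_biased_text).
    """
    weights = {}
    for _ in range(epochs):
        for biased, neutral in pairs:
            p = prob_more_biased(weights, biased, neutral)
            # Difference in word counts between the two versions.
            diff = featurize(biased)
            diff.subtract(featurize(neutral))
            for word, d in diff.items():
                w = weights.get(word, 0.0)
                # d(log p)/dw = (1 - p) * d; the l2 term shrinks the weight.
                weights[word] = w + lr * ((1.0 - p) * d - l2 * w)
    return weights


# Toy revision pairs, invented for illustration (not the paper's data).
PAIRS = [
    ("the senator gave a disastrous speech", "the senator gave a speech"),
    ("the notorious company released a product", "the company released a product"),
    ("a disastrous policy was adopted", "a policy was adopted"),
]

weights = train(PAIRS)
# Interpreting the parameters: the highest-weighted words signal bias.
top_words = sorted(weights, key=weights.get, reverse=True)[:2]
```

Because the score is linear in word counts, each learned weight can be read directly as how strongly a word pushes a document toward the "more biased" side of a comparison. On this toy data, the only words that receive positive weight are the ones that appear solely in the more biased revisions.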

arXiv information

Authors	Aswin Suresh, Chi-Hsuan Wu, Matthias Grossglauser
Published	2023-07-16 19:35:38+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
