The Unseen Targets of Hate — A Systematic Review of Hateful Communication Datasets

Abstract

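Machine learning (ML)-based content moderation tools are essential for keeping online spaces free from hateful communication.
However, ML tools can only be as capable as the quality of their training data allows.
While there is growing evidence that these tools underperform at detecting hateful communication directed at specific identities and may discriminate against them, surprisingly little is known about the provenance of such bias.
To fill this gap, we present a systematic review of datasets for the automated detection of hateful communication introduced over the past decade, and examine dataset quality in terms of the identities the datasets embody, namely the identities of the targets of hateful communication.
These include the targets that data curators focused on as well as those unintentionally included in the datasets.
Overall, we find a skewed representation of selected target identities and mismatches between the targets that research conceptualizes and those ultimately included in datasets.
Yet, by contextualizing these findings in the language and location of origin of the datasets, we highlight a positive trend towards the broadening and diversification of this research space.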

Abstract (original)

Machine learning (ML)-based content moderation tools are essential to keep online spaces free from hateful communication. Yet, ML tools can only be as capable as the quality of the data they are trained on allows them. While there is increasing evidence that they underperform in detecting hateful communications directed towards specific identities and may discriminate against them, we know surprisingly little about the provenance of such bias. To fill this gap, we present a systematic review of the datasets for the automated detection of hateful communication introduced over the past decade, and unpack the quality of the datasets in terms of the identities that they embody: those of the targets of hateful communication that the data curators focused on, as well as those unintentionally included in the datasets. We find, overall, a skewed representation of selected target identities and mismatches between the targets that research conceptualizes and ultimately includes in datasets. Yet, by contextualizing these findings in the language and location of origin of the datasets, we highlight a positive trend towards the broadening and diversification of this research space.

arXiv information

Authors	Zehui Yu,Indira Sen,Dennis Assenmacher,Mattia Samory,Leon Fröhling,Christina Dahn,Debora Nozza,Claudia Wagner
Publication date	2024-05-14 12:50:33+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
