RAUCG: Retrieval-Augmented Unsupervised Counter Narrative Generation for Hate Speech

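Summary

Counter Narrative (CN) is a promising approach to combating online hate speech (HS) without infringing on freedom of speech.
In recent years, interest has grown in generating CNs automatically with natural language generation techniques.
However, current automatic CN generation methods rely mainly on expert-authored datasets for training, which are time-consuming and labor-intensive to acquire.
Furthermore, these methods cannot directly obtain and extend counter-knowledge from external statistics, facts, or examples.
To address these limitations, the authors propose Retrieval-Augmented Unsupervised Counter Narrative Generation (RAUCG), which automatically expands external counter-knowledge and maps it into CNs in an unsupervised paradigm.
Specifically, they first introduce an SSF retrieval method that retrieves counter-knowledge from the multiple perspectives of stance consistency, semantic overlap rate, and fitness for HS.
They then design an energy-based decoding mechanism by quantizing knowledge injection, countering, and fluency constraints into differentiable functions, so that the model can build mappings from counter-knowledge to CNs without expert-authored CN data.
Finally, they comprehensively evaluate model performance in terms of language quality, toxicity, persuasiveness, relevance, and success rate of countering HS.
Experimental results show that RAUCG outperforms strong baselines on all metrics and exhibits stronger generalization capabilities, achieving significant improvements of +2.0% in relevance and +4.5% in the success rate of countering.
Moreover, RAUCG enabled GPT2 to outperform T0 on all metrics, even though T0 is approximately eight times larger than GPT2.
Warning: This paper may contain offensive or upsetting content.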

Abstract (original)

The Counter Narrative (CN) is a promising approach to combat online hate speech (HS) without infringing on freedom of speech. In recent years, there has been a growing interest in automatically generating CNs using natural language generation techniques. However, current automatic CN generation methods mainly rely on expert-authored datasets for training, which are time-consuming and labor-intensive to acquire. Furthermore, these methods cannot directly obtain and extend counter-knowledge from external statistics, facts, or examples. To address these limitations, we propose Retrieval-Augmented Unsupervised Counter Narrative Generation (RAUCG) to automatically expand external counter-knowledge and map it into CNs in an unsupervised paradigm. Specifically, we first introduce an SSF retrieval method to retrieve counter-knowledge from the multiple perspectives of stance consistency, semantic overlap rate, and fitness for HS. Then we design an energy-based decoding mechanism by quantizing knowledge injection, countering and fluency constraints into differentiable functions, to enable the model to build mappings from counter-knowledge to CNs without expert-authored CN data. Lastly, we comprehensively evaluate model performance in terms of language quality, toxicity, persuasiveness, relevance, and success rate of countering HS, etc. Experimental results show that RAUCG outperforms strong baselines on all metrics and exhibits stronger generalization capabilities, achieving significant improvements of +2.0% in relevance and +4.5% in success rate of countering metrics. Moreover, RAUCG enabled GPT2 to outperform T0 in all metrics, despite the latter being approximately eight times larger than the former. Warning: This paper may contain offensive or upsetting content!
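The abstract names three retrieval criteria (stance consistency, semantic overlap rate, and fitness for HS) but does not say how they are scored or combined. The following is only a minimal sketch of multi-criterion ranking under stated assumptions: the `Candidate` class, the precomputed `stance` and `fitness` scores, the token-level Jaccard stand-in for semantic overlap, and the equal-weight sum are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    stance: float   # hypothetical stance-consistency score in [0, 1]
    fitness: float  # hypothetical fitness-for-HS score in [0, 1]


def overlap_rate(a: str, b: str) -> float:
    """Toy stand-in for semantic overlap: Jaccard similarity of lowercase tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def rank(hs: str, candidates, weights=(1.0, 1.0, 1.0)):
    """Sort candidates by a weighted sum of the three criteria (an assumed combination rule)."""
    w_stance, w_overlap, w_fitness = weights

    def score(c: Candidate) -> float:
        return (w_stance * c.stance
                + w_overlap * overlap_rate(hs, c.text)
                + w_fitness * c.fitness)

    return sorted(candidates, key=score, reverse=True)
```

Calling `rank(hs_text, candidates)` returns the candidates ordered from highest to lowest combined score; in a real system the stance and fitness scores would presumably come from trained models rather than hand-set numbers.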

arXiv information

Authors	Shuyu Jiang, Wenyi Tang, Xingshu Chen, Rui Tang, Haizhou Wang, Wenxian Wang
Published	2023-10-09 12:01:26+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
