MoralBERT: A Fine-Tuned Language Model for Capturing Moral Values in Social Discussions

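Summary

Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements about important social issues.
Controversial topics such as vaccination, abortion, racism, and sexual orientation often elicit opinions and attitudes that are not based solely on evidence but instead reflect moral worldviews.
Recent advances in Natural Language Processing (NLP) show that moral values can be measured in human-generated textual content.
Building on Moral Foundations Theory (MFT), this paper introduces MoralBERT, a set of language representation models fine-tuned to capture moral sentiment in social discourse.
We describe a framework for both aggregated training and domain-adversarial training on multiple heterogeneous, human-annotated MFT datasets sourced from Twitter (now X), Reddit, and Facebook, which broaden the diversity of textual content in terms of social media audience interests, content presentation and style, and spreading patterns.
We show that the proposed framework achieves an average F1 score 11% to 32% higher than lexicon-based approaches, Word2Vec embeddings, and zero-shot classification with large language models such as GPT-4 for in-domain inference.
Domain-adversarial training yields better out-of-domain predictions than aggregated training while achieving performance comparable to zero-shot learning.
Our approach contributes to annotation-free, effective morality learning and offers useful insights towards a more comprehensive understanding of moral narratives in controversial social debates using NLP.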

Abstract (original)

Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. Controversial topics, including vaccination, abortion, racism, and sexual orientation, often elicit opinions and attitudes that are not solely based on evidence but rather reflect moral worldviews. Recent advances in Natural Language Processing (NLP) show that moral values can be gauged in human-generated textual content. Building on the Moral Foundations Theory (MFT), this paper introduces MoralBERT, a range of language representation models fine-tuned to capture moral sentiment in social discourse. We describe a framework for both aggregated and domain-adversarial training on multiple heterogeneous MFT human-annotated datasets sourced from Twitter (now X), Reddit, and Facebook that broaden textual content diversity in terms of social media audience interests, content presentation and style, and spreading patterns. We show that the proposed framework achieves an average F1 score that is between 11% and 32% higher than lexicon-based approaches, Word2Vec embeddings, and zero-shot classification with large language models such as GPT-4 for in-domain inference. Domain-adversarial training yields better out-of-domain predictions than aggregate training while achieving comparable performance to zero-shot learning. Our approach contributes to annotation-free and effective morality learning, and provides useful insights towards a more comprehensive understanding of moral narratives in controversial social debates using NLP.

arXiv information

Authors	Vjosa Preniqi, Iacopo Ghinassi, Julia Ive, Charalampos Saitis, Kyriaki Kalimeri
Published	2024-07-19 15:27:35+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
