Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation

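Summary

In today's digital era, the rapid spread of misinformation threatens public well-being and societal trust.
As online misinformation proliferates, manual verification by fact-checkers becomes increasingly difficult.
We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching phase of fact-checking using large language models (LLMs).
The framework identifies new social media content that either supports or contradicts claims previously debunked by fact-checkers.
Our approach uses GPT-4 to generate a labeled dataset of simulated social media posts.
This dataset serves as training data for fine-tuning more specialized LLMs.
We evaluated FACT-GPT on an extensive dataset of social media content related to public health.
The results indicate that our fine-tuned LLMs rival the performance of larger pre-trained LLMs on claim matching tasks and align closely with human annotations.
This study achieves three key milestones. First, it provides an automated framework for enhanced fact-checking.
Second, it demonstrates the potential of LLMs to complement human expertise.
Third, it offers public resources, including datasets and models, to support further research and applications in the fact-checking domain.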
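
The pipeline summarized above (labeled claim/post pairs used to fine-tune a smaller model, then compared against human annotations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the label names (including the third `NEUTRAL` class), the prompt wording, the `ClaimPostPair` record, and the `prompt`/`completion` JSONL layout are all assumptions, and the example claim and post are made up.

```python
import json
from dataclasses import dataclass

# Hypothetical label set: the abstract only mentions posts that support or
# contradict a debunked claim; NEUTRAL is an assumed third class.
LABELS = ("SUPPORTS", "CONTRADICTS", "NEUTRAL")


@dataclass
class ClaimPostPair:
    claim: str  # a claim previously debunked by fact-checkers
    post: str   # a (simulated) social media post
    label: str  # one of LABELS


def build_prompt(claim: str, post: str) -> str:
    """Render one claim/post pair as a classification prompt (assumed wording)."""
    return (
        f"Claim: {claim}\n"
        f"Post: {post}\n"
        f"Answer with one of: {', '.join(LABELS)}."
    )


def to_jsonl(pairs: list[ClaimPostPair]) -> str:
    """Serialize labeled pairs as one JSON record per line for fine-tuning."""
    lines = []
    for pair in pairs:
        if pair.label not in LABELS:
            raise ValueError(f"unknown label: {pair.label!r}")
        record = {
            "prompt": build_prompt(pair.claim, pair.post),
            "completion": pair.label,
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


def agreement(predicted: list[str], human: list[str]) -> float:
    """Fraction of items where the model label equals the human label."""
    if len(predicted) != len(human) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


if __name__ == "__main__":
    # Made-up example pair, for illustration only.
    pairs = [
        ClaimPostPair(
            claim="Vitamin C cures the flu.",
            post="Took vitamin C and my flu vanished overnight!",
            label="SUPPORTS",
        )
    ]
    print(to_jsonl(pairs))
    print(agreement(["SUPPORTS", "NEUTRAL"], ["SUPPORTS", "CONTRADICTS"]))
```

Raw agreement is the simplest way to compare model labels with human labels; a chance-corrected statistic or per-class precision and recall would usually be reported alongside it.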

Summary (original)

In today’s digital era, the rapid spread of misinformation poses threats to public well-being and societal trust. As online misinformation proliferates, manual verification by fact checkers becomes increasingly challenging. We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching phase of fact-checking using Large Language Models (LLMs). This framework identifies new social media content that either supports or contradicts claims previously debunked by fact-checkers. Our approach employs GPT-4 to generate a labeled dataset consisting of simulated social media posts. This data set serves as a training ground for fine-tuning more specialized LLMs. We evaluated FACT-GPT on an extensive dataset of social media content related to public health. The results indicate that our fine-tuned LLMs rival the performance of larger pre-trained LLMs in claim matching tasks, aligning closely with human annotations. This study achieves three key milestones: it provides an automated framework for enhanced fact-checking; demonstrates the potential of LLMs to complement human expertise; offers public resources, including datasets and models, to further research and applications in the fact-checking domain.

arXiv information

Authors	Eun Cheol Choi, Emilio Ferrara
Published	2023-10-13 16:21:07+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
