Evaluating open-source Large Language Models for automated fact-checking

Summary

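The growing prevalence of online misinformation has increased the demand for automated fact-checking solutions.
Large Language Models (LLMs) have emerged as potential tools to assist with this task, but their effectiveness remains uncertain.
This study evaluates the fact-checking capabilities of various open-source LLMs, focusing on their ability to assess claims given different levels of contextual information.
We conduct three key experiments: (1) evaluating whether LLMs can identify the semantic relationship between a claim and a fact-checking article, (2) assessing the models' accuracy in verifying claims when given a related fact-checking article, and (3) testing LLMs' fact-checking abilities when they draw on data from external knowledge sources such as Google and Wikipedia.
Our results indicate that LLMs perform well at identifying claim-article connections and verifying fact-checked stories, but struggle to confirm factual news, where they are outperformed by traditional fine-tuned models such as RoBERTa.
In addition, introducing external knowledge does not significantly improve LLMs' performance, which calls for more tailored approaches.
Our findings highlight both the potential and the limitations of LLMs in automated fact-checking, and underscore the need for further refinement before they can reliably replace human fact-checkers.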

Summary (original)

The increasing prevalence of online misinformation has heightened the demand for automated fact-checking solutions. Large Language Models (LLMs) have emerged as potential tools for assisting in this task, but their effectiveness remains uncertain. This study evaluates the fact-checking capabilities of various open-source LLMs, focusing on their ability to assess claims with different levels of contextual information. We conduct three key experiments: (1) evaluating whether LLMs can identify the semantic relationship between a claim and a fact-checking article, (2) assessing models’ accuracy in verifying claims when given a related fact-checking article, and (3) testing LLMs’ fact-checking abilities when leveraging data from external knowledge sources such as Google and Wikipedia. Our results indicate that LLMs perform well in identifying claim-article connections and verifying fact-checked stories but struggle with confirming factual news, where they are outperformed by traditional fine-tuned models such as RoBERTa. Additionally, the introduction of external knowledge does not significantly enhance LLMs’ performance, calling for more tailored approaches. Our findings highlight both the potential and limitations of LLMs in automated fact-checking, emphasizing the need for further refinements before they can reliably replace human fact-checkers.

arXiv information

Authors	Nicolo’ Fontana,Francesco Corso,Enrico Zuccolotto,Francesco Pierri
Published	2025-03-07 16:45:33+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
