LTCR: Long-Text Chinese Rumor Detection Dataset

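Summary

The Long-Text Chinese Rumor detection (LTCR) dataset that we developed focuses on identifying misleading information in the context of rumor verification.
Particularly during the COVID-19 pandemic, false information spreads rapidly on social media platforms and can negatively affect people's health behaviors and their responses to health emergencies.
The LTCR dataset provides a resource for accurate misinformation detection, aimed at improving the identification of fake news in longer and more complex texts.
The dataset consists of 1,729 real news items and 500 fake news items.
The average lengths of real and fake news items are approximately 230 and 152 characters, respectively.
We also propose a Salience-aware Fake News Detection Model, which achieves the highest accuracy (95.85%), fake news recall (90.91%), and F-score (90.60%) on the dataset. (https://github.com/Enderfga/DoubleCheck)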
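
To put the reported figures in context, the class balance implied by the stated counts can be worked out directly. The sketch below is only an illustration: it assumes that "F-score" means F1 on the fake class, and it evaluates a trivial "label everything real" baseline on the full dataset, whereas the abstract does not describe the authors' evaluation split. The helper name `binary_metrics` is hypothetical and is not taken from the authors' repository.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, recall, F1), treating fake news as the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, recall, f1


# Counts stated in the abstract.
N_REAL, N_FAKE = 1729, 500

# Share of fake items in the whole dataset.
fake_share = N_FAKE / (N_REAL + N_FAKE)

# Trivial baseline (an assumption for illustration): predict "real" for everything,
# so every fake item is a false negative and every real item a true negative.
baseline = binary_metrics(tp=0, fp=0, fn=N_FAKE, tn=N_REAL)

print(f"fake share: {fake_share:.2%}")
print(f"baseline accuracy: {baseline[0]:.2%}, "
      f"fake recall: {baseline[1]:.2%}, F1: {baseline[2]:.2%}")
```

Because roughly three quarters of the items are real, accuracy alone can look strong even for a model that never flags a rumor, which is presumably why the abstract reports fake news recall and F-score alongside accuracy.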

Abstract (original)

The Long-Text Chinese Rumor detection dataset we developed is focusing on the identification of misleading information in the context of rumor verification. Especially in the current era of the COVID-19 pandemic, false information spread rapidly on social media platforms and can negatively impact people’s health behaviors and responses to health emergencies. By providing a resource for accurate misinformation detection, the LTCR dataset offers a resource for improving the identification of fake news, particularly longer and more complex texts. The dataset consists of 1,729 and 500 pieces of real and fake news, respectively. The average lengths of real and fake news are approximately 230 and 152 characters. We also propose a Salience-aware Fake News Detection Model, which achieves the highest accuracy (95.85%), fake news recall (90.91%) and F-score (90.60%) on the dataset. (https://github.com/Enderfga/DoubleCheck)

arXiv information

Authors	Ziyang Ma, Mengsha Liu, Guian Fang, Ying Shen
Published	2023-06-12 16:03:36+00:00

Source and services used

arxiv.jp, Google
