Technical Report on the Pangram AI-Generated Text Classifier

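Summary

Pangram Text is a transformer-based neural network trained to distinguish text written by large language models from text written by humans.
Pangram Text outperforms zero-shot methods such as DetectGPT, as well as leading commercial AI detection tools, with over 38 times lower error rates on a comprehensive benchmark of 10 text domains (student writing, creative writing, scientific writing, books, encyclopedias, news, email, scientific papers, short-form Q&A) and 8 open- and closed-source large language models.
We propose a training algorithm, hard negative mining with synthetic mirrors, that enables the classifier to achieve orders of magnitude lower false positive rates on high-data domains such as reviews.
Finally, we show that Pangram Text is not biased against nonnative English speakers and generalizes to domains and models unseen during training.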

Summary (original)

We present Pangram Text, a transformer-based neural network trained to distinguish text written by large language models from text written by humans. Pangram Text outperforms zero-shot methods such as DetectGPT as well as leading commercial AI detection tools with over 38 times lower error rates on a comprehensive benchmark comprised of 10 text domains (student writing, creative writing, scientific writing, books, encyclopedias, news, email, scientific papers, short-form Q&A) and 8 open- and closed-source large language models. We propose a training algorithm, hard negative mining with synthetic mirrors, that enables our classifier to achieve orders of magnitude lower false positive rates on high-data domains such as reviews. Finally, we show that Pangram Text is not biased against nonnative English speakers and generalizes to domains and models unseen during training.
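
The abstract names hard negative mining with synthetic mirrors but does not spell out the procedure, so the following is only a minimal sketch of a generic hard-negative-mining round, not the authors' implementation. It assumes that a "hard negative" is a human-written text the current classifier wrongly flags as AI-generated, and that each one is paired with a machine-written counterpart before retraining. The names `score_fn`, `mirror_fn`, and the 0.5 threshold are hypothetical placeholders introduced for illustration.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (text, label): 0 = human-written, 1 = AI-generated


def false_positive_rate(human_scores: List[float], threshold: float = 0.5) -> float:
    """Fraction of human-written texts scored at or above the threshold."""
    if not human_scores:
        return 0.0
    return sum(s >= threshold for s in human_scores) / len(human_scores)


def mine_hard_negatives(
    human_texts: List[str],
    score_fn: Callable[[str], float],  # hypothetical: classifier's P(AI-generated)
    threshold: float = 0.5,            # assumed decision threshold
) -> List[str]:
    """Return the human-written texts the current classifier wrongly flags as AI."""
    return [text for text in human_texts if score_fn(text) >= threshold]


def augment_with_mirrors(
    train_set: List[Example],
    hard_negatives: List[str],
    mirror_fn: Callable[[str], str],  # hypothetical: produces a machine-written counterpart
) -> List[Example]:
    """Add each hard negative (label 0) together with its mirror (label 1)."""
    augmented = list(train_set)
    for text in hard_negatives:
        augmented.append((text, 0))
        augmented.append((mirror_fn(text), 1))
    return augmented
```

In a loop of this shape, one would retrain on the augmented set and repeat the mining step, tracking `false_positive_rate` on held-out human text between rounds. How the mirrors are actually generated, and how many rounds are run, is not stated in the abstract.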

arXiv information

Authors	Bradley Emi, Max Spero
Published	2024-07-29 08:27:34+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
