Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text

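Abstract

The development of generative AI large language models (LLMs) has raised alarm about distinguishing content produced by generative AI from content written by humans.
One concern arises when students rely heavily on such tools in ways that can hinder the development of their writing or coding skills.
Plagiarism concerns also apply.
This study aims to support efforts to detect and identify textual content generated with LLM tools.
We hypothesize that LLM-generated text is detectable by machine learning (ML), and we investigate ML models that can recognize and differentiate text generated by multiple LLM tools.
We leverage several ML and deep learning (DL) algorithms, such as Random Forest (RF) and Recurrent Neural Networks (RNN), and use Explainable Artificial Intelligence (XAI) to understand which features matter for attribution.
Our method is divided into 1) binary classification, which differentiates human-written text from AI-generated text, and 2) multi-class classification, which differentiates human-written text from text generated by five different LLM tools (ChatGPT, LLaMA, Google Bard, Claude, and Perplexity).
The results show high accuracy in both the multi-class and binary classification tasks.
Our model outperformed GPTZero, achieving 98.5% accuracy versus 78.3%.
Notably, GPTZero failed to recognize about 4.2% of the observations, whereas our model recognized the complete test dataset.
The XAI results showed that understanding feature importance across classes enables detailed author/source profiles.
Highlighting unique stylistic and structural elements further aids attribution and supports plagiarism detection, helping ensure robust verification of content originality.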
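
The two tasks named in the abstract share the same data and differ only in their labels, which a short sketch can make concrete. The Python below is a minimal illustration under stated assumptions, not the authors' pipeline: the label strings, the helper names `to_binary_label` and `stylometric_features`, and the three toy features are all hypothetical, since the abstract does not list the features the paper actually uses.

```python
import re
from statistics import mean

# Source classes named in the abstract; the exact label strings are an assumption.
SOURCES = ["human", "ChatGPT", "LLaMA", "Google Bard", "Claude", "Perplexity"]


def to_binary_label(source: str) -> str:
    """Collapse a multi-class source label into the binary task's label."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return "human" if source == "human" else "ai"


def stylometric_features(text: str) -> dict:
    """Compute a few toy stylistic features (hypothetical, not the paper's feature set)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return {"avg_word_len": 0.0, "type_token_ratio": 0.0, "avg_sentence_len": 0.0}
    return {
        "avg_word_len": mean(len(w) for w in words),        # mean characters per word
        "type_token_ratio": len(set(words)) / len(words),   # vocabulary diversity
        "avg_sentence_len": len(words) / len(sentences),    # mean words per sentence
    }


sample = "The cat sat. The cat ran away!"
features = stylometric_features(sample)
```

Feature dictionaries like `features` could then be fed to any classifier, once with the six-way source labels and once with the labels collapsed through `to_binary_label`.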

Abstract (original)

The development of Generative AI Large Language Models (LLMs) raised the alarm regarding identifying content produced through generative AI or humans. In one case, issues arise when students heavily rely on such tools in a manner that can affect the development of their writing or coding skills. Other issues of plagiarism also apply. This study aims to support efforts to detect and identify textual content generated using LLM tools. We hypothesize that LLMs-generated text is detectable by machine learning (ML), and investigate ML models that can recognize and differentiate texts generated by multiple LLMs tools. We leverage several ML and Deep Learning (DL) algorithms such as Random Forest (RF), and Recurrent Neural Networks (RNN), and utilized Explainable Artificial Intelligence (XAI) to understand the important features in attribution. Our method is divided into 1) binary classification to differentiate between human-written and AI-text, and 2) multi classification, to differentiate between human-written text and the text generated by the five different LLM tools (ChatGPT, LLaMA, Google Bard, Claude, and Perplexity). Results show high accuracy in the multi and binary classification. Our model outperformed GPTZero with 98.5% accuracy to 78.3%. Notably, GPTZero was unable to recognize about 4.2% of the observations, but our model was able to recognize the complete test dataset. XAI results showed that understanding feature importance across different classes enables detailed author/source profiles. Further, aiding in attribution and supporting plagiarism detection by highlighting unique stylistic and structural elements ensuring robust content originality verification.

arXiv information

Authors	Ayat Najjar, Huthaifa I. Ashqar, Omar Darwish, Eman Hammad
Published	2025-01-06 18:46:53+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
