TOPFORMER: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

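Abstract

Recent advances in Large Language Models (LLMs) have enabled the generation of open-ended, high-quality texts that are non-trivial to distinguish from human-written texts.
We refer to such LLM-generated texts as deepfake texts.
There are currently over 72,000 text generation models in the Hugging Face model repository.
As a result, users with malicious intent can easily use these open-source LLMs to generate harmful texts and dis/misinformation at scale.
To mitigate this problem, a computational method is needed to determine whether a given text is a deepfake text, i.e., a Turing Test (TT).
In particular, this work investigates a more general version of the problem, known as Authorship Attribution (AA), in a multi-class setting: not only determining whether a given text is a deepfake text, but also pinpointing which LLM is the author.
We propose TopFormer, which improves existing AA solutions by including a Topological Data Analysis (TDA) layer in a Transformer-based model to capture more linguistic patterns in deepfake texts.
We show the benefits of a TDA layer on imbalanced, multi-style datasets by extracting TDA features with the reshaped $pooled\_output$ of the backbone as input.
The Transformer-based model captures contextual representations (i.e., semantic and syntactic linguistic features), while TDA captures the shape and structure of the data (i.e., linguistic structures).
Finally, TopFormer outperforms all baselines on all 3 datasets, achieving up to a 7\% increase in Macro F1 score.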
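
The abstract does not say which TDA features are extracted or how the $pooled\_output$ is reshaped, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the pooled vector is reshaped into a small point cloud and summarized by 0-dimensional persistence (the scales at which connected components merge in a Vietoris-Rips filtration). The helper names `reshape` and `zero_dim_persistence`, the toy vector, and the 3 x 2 shape are all hypothetical.

```python
import math
from itertools import combinations


def reshape(vector, rows, cols):
    """Reshape a flat vector into `rows` points of dimension `cols`."""
    if len(vector) != rows * cols:
        raise ValueError("vector length must equal rows * cols")
    return [vector[r * cols:(r + 1) * cols] for r in range(rows)]


def zero_dim_persistence(points):
    """Death scales of connected components in a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies when the shortest
    remaining edge merges it into another one, which is exactly Kruskal's
    minimum-spanning-tree algorithm with a union-find structure.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All pairwise Euclidean distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths  # n - 1 finite values; one component never dies


# Hypothetical 6-dimensional pooled output, reshaped into 3 points in 2-D.
pooled_output = [0.0, 0.0, 3.0, 4.0, 3.0, 5.0]
cloud = reshape(pooled_output, rows=3, cols=2)
tda_features = zero_dim_persistence(cloud)
print(tda_features)
```

In a full model, a feature vector like `tda_features` would presumably be passed to the classification head; a dedicated library such as GUDHI or Ripser would be the more likely choice for higher-dimensional persistence.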

Abstract (original)

Recent advances in Large Language Models (LLMs) have enabled the generation of open-ended high-quality texts, that are non-trivial to distinguish from human-written texts. We refer to such LLM-generated texts as deepfake texts. There are currently over 72K text generation models in the huggingface model repo. As such, users with malicious intent can easily use these open-sourced LLMs to generate harmful texts and dis/misinformation at scale. To mitigate this problem, a computational method to determine if a given text is a deepfake text or not is desired–i.e., Turing Test (TT). In particular, in this work, we investigate the more general version of the problem, known as Authorship Attribution (AA), in a multi-class setting–i.e., not only determining if a given text is a deepfake text or not but also being able to pinpoint which LLM is the author. We propose TopFormer to improve existing AA solutions by capturing more linguistic patterns in deepfake texts by including a Topological Data Analysis (TDA) layer in the Transformer-based model. We show the benefits of having a TDA layer when dealing with imbalanced, and multi-style datasets, by extracting TDA features from the reshaped $pooled\_output$ of our backbone as input. This Transformer-based model captures contextual representations (i.e., semantic and syntactic linguistic features), while TDA captures the shape and structure of data (i.e., linguistic structures). Finally, TopFormer, outperforms all baselines in all 3 datasets, achieving up to 7\% increase in Macro F1 score.

arXiv information

Authors	Adaku Uchendu, Thai Le, Dongwon Lee
Published	2024-04-09 11:27:48+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
