EM-MIAs: Enhancing Membership Inference Attacks in Large Language Models through Ensemble Modeling

Abstract

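As large language models (LLMs) are deployed more widely, concern about privacy leakage from model training data has grown.
Membership inference attacks (MIAs) have emerged as an important tool for assessing the privacy risks of these models.
Existing attacks such as LOSS, reference-based, min-k, and zlib work well in certain scenarios, but against large pre-trained language models their effectiveness often approaches random guessing, particularly with large-scale datasets and single-epoch training.
To address this, the paper proposes a new ensemble attack, EM-MIAs, which integrates several existing MIA methods (LOSS, reference-based, min-k, and zlib) into an XGBoost-based model to improve overall attack performance.
Experiments show that the ensemble model significantly improves both AUC-ROC and accuracy over the individual attacks across a range of large language models and datasets.
This indicates that combining the strengths of different methods identifies members of a model's training data more effectively, yielding a more robust tool for assessing LLM privacy risk.
The study points to new directions for research on LLM privacy protection and underscores the need for stronger privacy auditing methods.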
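
To make the ensemble idea concrete, here is a minimal sketch of how the four per-sample attack signals could be turned into one feature vector. The abstract does not give formulas, so the definitions below follow the commonly used forms of these attacks and are assumptions, as are all function names and the `k=0.2` default. The sketch also assumes per-token log-probabilities have already been obtained from the target model and from a separate reference model.

```python
import zlib
from statistics import mean


def loss_score(token_logprobs):
    """LOSS signal: average negative log-likelihood of the text."""
    return -mean(token_logprobs)


def reference_score(target_logprobs, reference_logprobs):
    """Reference-based signal: target-model loss minus reference-model loss."""
    return loss_score(target_logprobs) - loss_score(reference_logprobs)


def min_k_score(token_logprobs, k=0.2):
    """min-k signal: mean log-probability of the k fraction of least likely tokens."""
    n = max(1, int(len(token_logprobs) * k))
    return mean(sorted(token_logprobs)[:n])


def zlib_score(token_logprobs, text):
    """zlib signal: summed negative log-likelihood per compressed byte of the text."""
    compressed_len = len(zlib.compress(text.encode("utf-8")))
    return -sum(token_logprobs) / compressed_len


def mia_features(text, target_logprobs, reference_logprobs, k=0.2):
    """Stack the four signals into one feature vector (hypothetical layout)."""
    return [
        loss_score(target_logprobs),
        reference_score(target_logprobs, reference_logprobs),
        min_k_score(target_logprobs, k),
        zlib_score(target_logprobs, text),
    ]


# Toy values for illustration only; these are not results from the paper.
text = "the quick brown fox"
target = [-0.5, -1.0, -2.0, -0.1, -3.0]
reference = [-1.0, -1.5, -2.5, -0.5, -3.5]
features = mia_features(text, target, reference)
```

Feature vectors like this, paired with member or non-member labels, could then be used to fit a gradient-boosted classifier such as `xgboost.XGBClassifier`, which is the role the abstract assigns to XGBoost. The training step is left out here because the abstract does not specify hyperparameters or data splits.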

Abstract (original)

With the widespread application of large language models (LLM), concerns about the privacy leakage of model training data have increasingly become a focus. Membership Inference Attacks (MIAs) have emerged as a critical tool for evaluating the privacy risks associated with these models. Although existing attack methods, such as LOSS, Reference-based, min-k, and zlib, perform well in certain scenarios, their effectiveness on large pre-trained language models often approaches random guessing, particularly in the context of large-scale datasets and single-epoch training. To address this issue, this paper proposes a novel ensemble attack method that integrates several existing MIAs techniques (LOSS, Reference-based, min-k, zlib) into an XGBoost-based model to enhance overall attack performance (EM-MIAs). Experimental results demonstrate that the ensemble model significantly improves both AUC-ROC and accuracy compared to individual attack methods across various large language models and datasets. This indicates that by combining the strengths of different methods, we can more effectively identify members of the model’s training data, thereby providing a more robust tool for evaluating the privacy risks of LLM. This study offers new directions for further research in the field of LLM privacy protection and underscores the necessity of developing more powerful privacy auditing methods.

arXiv information

Authors	Zichen Song, Sitan Huang, Zhongfeng Kang
Published	2024-12-23 03:47:54+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
