AMUN: Adversarial Machine UNlearning

Summary

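Machine unlearning, in which users can request that a forget dataset be deleted from a trained model, is becoming increasingly important because of numerous privacy regulations.
Early work on "exact" unlearning (e.g., retraining) incurs large computational overhead.
"Approximate" methods are computationally cheap, but they have not matched the effectiveness of exact unlearning: the models they produce fail to reach comparable accuracy and prediction confidence on both the forget and test (i.e., unseen) datasets.
Exploiting this observation, we propose Adversarial Machine UNlearning (AMUN), a new unlearning method that outperforms prior state-of-the-art (SOTA) methods for image classification.
AMUN lowers the model's confidence on the forget samples by fine-tuning the model on their corresponding adversarial examples.
Adversarial examples naturally belong to the distribution that the model imposes on the input space.
Fine-tuning the model on the adversarial examples closest to the corresponding forget samples (a) localizes the changes to the model's decision boundary around each forget sample and (b) avoids drastic changes to the model's global behavior, thereby preserving its accuracy on test samples.
When AMUN is used to unlearn a random $10\%$ of CIFAR-10 samples, even SOTA membership inference attacks do no better than random guessing.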
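
The fine-tuning step described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a two-feature logistic-regression model with hypothetical weights, uses the closed-form closest adversarial example that exists only for linear models, and assumes the adversarial example is labeled with the model's own (flipped) prediction. All function names are made up for illustration. Because a linear model has a single global boundary, the sketch shows only the confidence-lowering mechanics, not the localization effect claimed for deep image classifiers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def confidence(w, b, x, y):
    """Probability the logistic model (w, b) assigns to label y (0 or 1) at x."""
    p = sigmoid(x @ w + b)
    return p if y == 1 else 1.0 - p

def closest_adversarial(w, b, x, overshoot=0.1):
    """Closest L2 adversarial example for a *linear* model (closed form).

    Projects x onto the decision boundary w.x + b = 0 and steps slightly
    past it, so the predicted label flips. Deep models have no closed
    form and would need an iterative attack instead.
    """
    signed_dist = (x @ w + b) / (w @ w)
    return x - (1.0 + overshoot) * signed_dist * w

def fine_tune(w, b, x_adv, y_adv, lr=0.5, steps=20):
    """Gradient descent on the logistic loss of one adversarial example."""
    w, b = w.copy(), float(b)  # leave the original parameters untouched
    for _ in range(steps):
        err = sigmoid(x_adv @ w + b) - y_adv
        w -= lr * err * x_adv
        b -= lr * err
    return w, b

# Hypothetical "trained" model and a single forget sample with label 1.
w0, b0 = np.array([1.0, 1.0]), 0.0
x_forget, y_forget = np.array([2.0, 1.0]), 1

x_adv = closest_adversarial(w0, b0, x_forget)
# ASSUMPTION: the adversarial example is labeled with the model's own prediction.
y_adv = int(x_adv @ w0 + b0 > 0)
w1, b1 = fine_tune(w0, b0, x_adv, y_adv)

print("confidence on forget sample before:", confidence(w0, b0, x_forget, y_forget))
print("confidence on forget sample after: ", confidence(w1, b1, x_forget, y_forget))
```

In this toy setting the adversarial example sits just across the boundary from the forget sample, so pulling the model toward the adversarial label reduces the probability it assigns to the forget sample's original label.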

Summary (original)

Machine unlearning, where users can request the deletion of a forget dataset, is becoming increasingly important because of numerous privacy regulations. Initial works on “exact” unlearning (e.g., retraining) incur large computational overheads. However, while computationally inexpensive, “approximate” methods have fallen short of reaching the effectiveness of exact unlearning: models produced fail to obtain comparable accuracy and prediction confidence on both the forget and test (i.e., unseen) dataset. Exploiting this observation, we propose a new unlearning method, Adversarial Machine UNlearning (AMUN), that outperforms prior state-of-the-art (SOTA) methods for image classification. AMUN lowers the confidence of the model on the forget samples by fine-tuning the model on their corresponding adversarial examples. Adversarial examples naturally belong to the distribution imposed by the model on the input space; fine-tuning the model on the adversarial examples closest to the corresponding forget samples (a) localizes the changes to the decision boundary of the model around each forget sample and (b) avoids drastic changes to the global behavior of the model, thereby preserving the model’s accuracy on test samples. Using AMUN for unlearning a random $10\%$ of CIFAR-10 samples, we observe that even SOTA membership inference attacks cannot do better than random guessing.
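
To make "cannot do better than random guessing" concrete, the sketch below scores a very simple membership inference attack by its AUC, where 0.5 corresponds to random guessing. It is only an illustration under stated assumptions: the attack is a plain confidence-threshold rule rather than the SOTA attacks the abstract refers to, the function name `attack_auc` is hypothetical, and the input scores are made-up placeholders, not results from the paper.

```python
import numpy as np

def attack_auc(member_scores, nonmember_scores):
    """AUC of the attack "predict member when the score is high".

    Computed as the fraction of (member, non-member) pairs in which the
    member has the higher score, counting ties as one half.
    """
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    wins = (m > n).sum() + 0.5 * (m == n).sum()
    return wins / (m.size * n.size)

# Placeholder confidences on forget samples (treated as members by the
# attacker) and on unseen test samples (non-members).
forget_conf = [0.91, 0.62, 0.77, 0.85]
test_conf = [0.88, 0.65, 0.79, 0.83]
print("attack AUC:", attack_auc(forget_conf, test_conf))
```

An AUC near 0.5 would mean the attacker cannot separate forget samples from unseen samples by confidence alone; values near 1.0 would indicate that the forget samples are still recognizable.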

arXiv information

Authors	Ali Ebrahimpour-Boroojeny, Hari Sundaram, Varun Chandrasekaran
Published	2025-05-01 15:21:54+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
