Deep Unlearning: Fast and Efficient Training-free Approach to Class Forgetting

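Summary

Machine unlearning is a prominent and challenging field, driven by regulatory demands for user data deletion and by growing privacy awareness.
Existing approaches retrain the model or run multiple fine-tuning steps for each deletion request, and they are often constrained by compute budgets and restricted data access.
In this work, we introduce a new class unlearning algorithm designed to strategically remove specific classes from a trained model.
Our algorithm first estimates the Retain Space and the Forget Space by applying singular value decomposition (SVD) to the layerwise activations of a small subset of samples from the retain classes and the unlearn classes, respectively.
It then computes the information shared between these two spaces and removes it from the Forget Space, which isolates the class-discriminatory feature space.
Finally, we obtain the unlearned model by updating the weights to suppress the class-discriminatory features in the activation spaces.
We demonstrate the algorithm's efficacy on ImageNet using a Vision Transformer: retain accuracy drops by only $\sim 1.5\%$ relative to the original model, while accuracy on samples of the unlearned class stays below $1\%$.
Our algorithm also performs consistently well under Membership Inference Attacks, showing an average improvement of $7.8\%$ over other baselines across a variety of image classification datasets and network architectures, while being $\sim 6 \times$ more computationally efficient.
Our code is available at https://github.com/sangamesh-kodge/class_forgetting.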
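
The three steps in the summary (estimate the two spaces with SVD, remove the shared part from the Forget Space, update the weights) can be illustrated with a small NumPy sketch. This is a sketch under stated assumptions, not the authors' implementation: it assumes a single linear layer `y = W x`, uses a fixed `rank` in place of whatever selection rule the paper applies, and models the "shared information" as the product of the two projection matrices. All function names are hypothetical and are not taken from the linked repository.

```python
import numpy as np


def estimate_basis(activations: np.ndarray, rank: int) -> np.ndarray:
    """Top-`rank` left singular vectors of a (features x samples) matrix."""
    u, _, _ = np.linalg.svd(activations, full_matrices=False)
    return u[:, :rank]


def discriminatory_projection(retain_acts: np.ndarray,
                              forget_acts: np.ndarray,
                              rank: int) -> np.ndarray:
    """Projection onto the part of the Forget Space not shared with the Retain Space."""
    u_r = estimate_basis(retain_acts, rank)
    u_f = estimate_basis(forget_acts, rank)
    p_r = u_r @ u_r.T          # projector onto the estimated Retain Space
    p_f = u_f @ u_f.T          # projector onto the estimated Forget Space
    shared = p_f @ p_r         # assumption: shared information as P_f P_r
    return p_f - shared        # equals P_f (I - P_r)


def unlearn_linear_layer(weight: np.ndarray, p_dis: np.ndarray) -> np.ndarray:
    """Suppress the discriminatory directions of the layer input: W (I - P_dis)."""
    return weight @ (np.eye(p_dis.shape[0]) - p_dis)


# Toy data (hypothetical): retain activations live in span(e0, e1),
# forget activations live in span(e1, e2), so e1 is shared and e2 is
# the only class-discriminatory direction.
rng = np.random.default_rng(0)
d = 6
eye = np.eye(d)
retain_acts = eye[:, [0, 1]] @ rng.normal(size=(2, 20))
forget_acts = eye[:, [1, 2]] @ rng.normal(size=(2, 20))
weight = rng.normal(size=(4, d))

p_dis = discriminatory_projection(retain_acts, forget_acts, rank=2)
new_weight = unlearn_linear_layer(weight, p_dis)
```

In this toy setup the update leaves the layer output unchanged for every retain activation, because `(I - P_r)` maps anything inside the Retain Space to zero, and it only removes the layer's response to the direction that the forget samples do not share with the retain samples. No gradient step is involved, which is consistent with the training-free framing in the title.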

Abstract (original)

Machine unlearning is a prominent and challenging field, driven by regulatory demands for user data deletion and heightened privacy awareness. Existing approaches involve retraining model or multiple finetuning steps for each deletion request, often constrained by computational limits and restricted data access. In this work, we introduce a novel class unlearning algorithm designed to strategically eliminate specific classes from the learned model. Our algorithm first estimates the Retain and the Forget Spaces using Singular Value Decomposition on the layerwise activations for a small subset of samples from the retain and unlearn classes, respectively. We then compute the shared information between these spaces and remove it from the forget space to isolate class-discriminatory feature space. Finally, we obtain the unlearned model by updating the weights to suppress the class discriminatory features from the activation spaces. We demonstrate our algorithm’s efficacy on ImageNet using a Vision Transformer with only $\sim 1.5\%$ drop in retain accuracy compared to the original model while maintaining under $1\%$ accuracy on the unlearned class samples. Further, our algorithm consistently performs well when subject to Membership Inference Attacks showing $7.8\%$ improvement on average across a variety of image classification datasets and network architectures, as compared to other baselines while being $\sim 6 \times$ more computationally efficient. Our code is available at https://github.com/sangamesh-kodge/class_forgetting.

arXiv information

Authors	Sangamesh Kodge, Gobinda Saha, Kaushik Roy
Published	2024-05-07 15:26:02+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
