Certified Data Removal Under High-dimensional Settings

Summary

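Machine unlearning focuses on the computationally efficient removal of specific training data from a trained model, so that the influence of the forgotten data is eliminated without full retraining.
Despite progress in low-dimensional settings, where the number of parameters \( p \) is much smaller than the sample size \( n \), extending similar theoretical guarantees to high-dimensional regimes remains difficult.
We propose an unlearning algorithm that starts from the original model parameters and performs a theory-guided sequence of \( T \in \{1,2\} \) Newton steps.
After this update, carefully scaled isotropic Laplacian noise is added to the estimate so that any (potential) residual influence of the forget data is completely removed.
We show that when both \( n, p \to \infty \) with a fixed ratio \( n/p \), significant theoretical and computational obstacles arise from the interplay between the complexity of the model and the finite signal-to-noise ratio.
Finally, we show that, unlike in low-dimensional settings, a single Newton step is insufficient for effective unlearning in high-dimensional problems, whereas two steps are enough to achieve the desired certifiability.
We provide numerical experiments to support the certifiability and accuracy claims of this approach.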
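
The procedure described above (start from the fitted parameters, take \( T \in \{1,2\} \) Newton steps on the retained data, then add isotropic Laplacian noise) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name a loss, so a ridge-penalized logistic regression is assumed here, and the names `fit`, `newton_unlearn`, `isotropic_laplace`, `lam`, and `noise_scale` are hypothetical. In particular, `noise_scale` is a free parameter in this sketch; the calibrated scale from the paper is not given in the abstract. The noise is assumed to have density proportional to \( \exp(-\|b\|_2 / s) \), sampled as a uniform direction times a Gamma-distributed radius.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_hess(beta, X, y, lam):
    """Gradient and Hessian of the ridge-penalized logistic loss (assumed model)."""
    prob = sigmoid(X @ beta)
    grad = X.T @ (prob - y) + lam * beta
    weights = prob * (1.0 - prob)
    hess = (X * weights[:, None]).T @ X + lam * np.eye(X.shape[1])
    return grad, hess

def fit(X, y, lam, iters=25):
    """Fit by plain Newton iterations from zero (stands in for full training)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        grad, hess = grad_hess(beta, X, y, lam)
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def isotropic_laplace(p, scale, rng):
    """Sample noise with density proportional to exp(-||b||_2 / scale) (assumed form)."""
    direction = rng.standard_normal(p)
    direction /= np.linalg.norm(direction)
    radius = rng.gamma(shape=p, scale=scale)
    return radius * direction

def newton_unlearn(beta_full, X, y, forget_idx, lam, T=2, noise_scale=0.0, rng=None):
    """Take T Newton steps on the retained data, then optionally add noise."""
    keep = np.setdiff1d(np.arange(len(y)), forget_idx)
    X_keep, y_keep = X[keep], y[keep]
    beta = beta_full.copy()
    for _ in range(T):
        grad, hess = grad_hess(beta, X_keep, y_keep, lam)
        beta = beta - np.linalg.solve(hess, grad)
    if noise_scale > 0:
        rng = np.random.default_rng() if rng is None else rng
        beta = beta + isotropic_laplace(beta.shape[0], noise_scale, rng)
    return beta

# Small synthetic demo: n and p are of comparable size, as in the regime discussed.
rng = np.random.default_rng(0)
n, p, lam = 200, 50, 1.0
X = rng.standard_normal((n, p)) / np.sqrt(p)
beta_true = rng.standard_normal(p)
y = (rng.random(n) < sigmoid(X @ beta_true)).astype(float)
forget_idx = np.arange(5)

beta_full = fit(X, y, lam)
beta_retrain = fit(X[5:], y[5:], lam)  # reference: retrain without the forget set
for T in (1, 2):
    beta_T = newton_unlearn(beta_full, X, y, forget_idx, lam, T=T)
    print(T, np.linalg.norm(beta_T - beta_retrain))
```

The demo prints the distance between the noiseless \( T \)-step estimate and the retrained reference, which is the gap the added noise is meant to mask. It makes no claim about the certifiability thresholds established in the paper.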

Abstract (original)

Machine unlearning focuses on the computationally efficient removal of specific training data from trained models, ensuring that the influence of forgotten data is effectively eliminated without the need for full retraining. Despite advances in low-dimensional settings, where the number of parameters \( p \) is much smaller than the sample size \( n \), extending similar theoretical guarantees to high-dimensional regimes remains challenging. We propose an unlearning algorithm that starts from the original model parameters and performs a theory-guided sequence of Newton steps \( T \in \{ 1,2\}\). After this update, carefully scaled isotropic Laplacian noise is added to the estimate to ensure that any (potential) residual influence of forget data is completely removed. We show that when both \( n, p \to \infty \) with a fixed ratio \( n/p \), significant theoretical and computational obstacles arise due to the interplay between the complexity of the model and the finite signal-to-noise ratio. Finally, we show that, unlike in low-dimensional settings, a single Newton step is insufficient for effective unlearning in high-dimensional problems; however, two steps are enough to achieve the desired certifiability. We provide numerical experiments to support the certifiability and accuracy claims of this approach.

arXiv information

Authors	Haolin Zou, Arnab Auddy, Yongchan Kwon, Kamiar Rahnama Rad, Arian Maleki
Published	2025-05-12 15:11:13+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
