GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models

Summary

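Unlearning in large language models (LLMs) is growing in importance because of regulatory compliance, copyright protection, and privacy concerns.
A key challenge in LLM unlearning, however, is unintended forgetting: removing specific data can inadvertently damage the model's utility and its retention of valuable, desired information.
Prior work has focused mainly on architectural innovations, while the effect of data-level factors on unlearning performance remains underexplored.
As a result, existing methods often show degraded retention when they forget high-impact data.
To address this, the authors propose GUARD, a new framework for Guided Unlearning And Retention via Data attribution.
At its core, GUARD introduces a lightweight proxy data attribution metric tailored to LLM unlearning, which quantifies the alignment between the forget and retain sets at low computational cost.
Building on this metric, they design a new unlearning objective that assigns adaptive, nonuniform unlearning weights to samples, inversely proportional to their proxy attribution scores.
By reallocating unlearning power in this way, GUARD mitigates unintended losses in retention.
The authors provide rigorous theoretical guarantees that GUARD significantly enhances retention while keeping forgetting metrics comparable to prior methods.
Extensive experiments on the TOFU benchmark across multiple LLM architectures show that GUARD substantially improves utility preservation while still ensuring effective unlearning.
Notably, GUARD reduces utility sacrifice on the Retain Set by up to 194.92% in terms of Truth Ratio when forgetting 10% of the training data.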

Summary (original)

Unlearning in large language models (LLMs) is becoming increasingly important due to regulatory compliance, copyright protection, and privacy concerns. However, a key challenge in LLM unlearning is unintended forgetting, where the removal of specific data inadvertently impairs the utility of the model and its retention of valuable, desired information. While prior work has primarily focused on architectural innovations, the influence of data-level factors on unlearning performance remains underexplored. As a result, existing methods often suffer from degraded retention when forgetting high-impact data. To address this, we propose GUARD, a novel framework for Guided Unlearning And Retention via Data attribution. At its core, GUARD introduces a lightweight proxy data attribution metric tailored for LLM unlearning, which quantifies the ‘alignment’ between the forget and retain sets while remaining computationally efficient. Building on this, we design a novel unlearning objective that assigns adaptive, nonuniform unlearning weights to samples, inversely proportional to their proxy attribution scores. Through such a reallocation of unlearning power, GUARD mitigates unintended losses in retention. We provide rigorous theoretical guarantees that GUARD significantly enhances retention while maintaining forgetting metrics comparable to prior methods. Extensive experiments on the TOFU benchmark across multiple LLM architectures demonstrate that GUARD substantially improves utility preservation while ensuring effective unlearning. Notably, GUARD reduces utility sacrifice on the Retain Set by up to 194.92% in terms of Truth Ratio when forgetting 10% of the training data.
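
The weighting idea in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the paper's actual formulation: the proxy score is taken to be the dot product between a forget sample's gradient and a mean retain-set gradient, and the weights come from a softmax over negated scores, which is a monotone stand-in for "inversely proportional" rather than a literal inverse. The function names, the temperature parameter, and the toy gradients are all hypothetical.

```python
import math


def proxy_attribution(forget_grads, retain_grad):
    """Hypothetical proxy score: dot product between each forget-sample
    gradient and a (mean) retain-set gradient. A larger score means the
    sample is more aligned with what the model should retain."""
    return [sum(f * r for f, r in zip(g, retain_grad)) for g in forget_grads]


def unlearning_weights(scores, temperature=1.0):
    """Assumed weighting rule: softmax over negated scores, so samples with
    higher attribution scores receive smaller unlearning weights.
    temperature must be positive; the weights sum to 1."""
    logits = [-s / temperature for s in scores]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_forget_loss(per_sample_losses, weights):
    """Weighted sum of per-sample forget losses; an unlearning step would
    push this quantity in the forgetting direction."""
    return sum(w * l for w, l in zip(weights, per_sample_losses))


# Toy example with made-up two-dimensional gradients.
forget_grads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
retain_grad = [1.0, 0.0]

scores = proxy_attribution(forget_grads, retain_grad)
weights = unlearning_weights(scores)
print(scores)
print(weights)
```

In this toy setup the first forget sample is the most aligned with the retain gradient, so it receives the smallest weight, which is the reallocation of unlearning power the abstract describes in words.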

arXiv information

Authors	Evelyn Ma,Duo Zhou,Peizhi Niu,Huiting Zhou,Huan Zhang,Olgica Milenkovic,S. Rasoul Etesami
Published	2025-06-12 17:49:09+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
