Unveiling the Pitfalls of Knowledge Editing for Large Language Models

Summary

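As the cost of fine-tuning large language models (LLMs) continues to rise, recent research has focused on developing methods for editing the implicit knowledge embedded in LLMs. However, a dark cloud still hangs overhead: could knowledge editing trigger a butterfly effect? This paper pioneers the investigation of the potential pitfalls of knowledge editing for LLMs. To this end, it introduces new benchmark datasets and proposes new evaluation metrics. The results highlight two key concerns: (1) Knowledge conflict: editing groups of facts that logically clash can magnify the inconsistencies inherent in LLMs. (2) Knowledge distortion: altering parameters in order to edit factual knowledge can irreversibly distort the innate knowledge structure of LLMs. The experimental results clearly show that knowledge editing can inadvertently bring unintended consequences to LLMs, which calls for attention and effort in future research. Code: https://github.com/zjunlp/PitfallsKnowledgeEditing.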

Abstract (original)

As the cost associated with fine-tuning Large Language Models (LLMs) continues to rise, recent research efforts have pivoted towards developing methodologies to edit implicit knowledge embedded within LLMs. Yet, there is still a dark cloud lingering overhead: will knowledge editing trigger a butterfly effect? It is still unclear whether knowledge editing might introduce side effects that pose potential risks. This paper pioneers the investigation into the potential pitfalls associated with knowledge editing for LLMs. To achieve this, we introduce new benchmark datasets and propose innovative evaluation metrics. Our results underline two pivotal concerns: (1) Knowledge Conflict: Editing groups of facts that logically clash can magnify the inherent inconsistencies in LLMs, a facet neglected by previous methods. (2) Knowledge Distortion: Altering parameters with the aim of editing factual knowledge can irrevocably warp the innate knowledge structure of LLMs. Experimental results vividly demonstrate that knowledge editing might inadvertently cast a shadow of unintended consequences on LLMs, which warrants attention and effort in future work. Code will be released at https://github.com/zjunlp/PitfallsKnowledgeEditing.

arXiv information

Authors	Zhoubo Li, Ningyu Zhang, Yunzhi Yao, Mengru Wang, Xi Chen, Huajun Chen
Published	2023-10-03 15:10:46+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
