Improving Causal Interventions in Amnesic Probing with Mean Projection or LEACE

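Summary

Amnesic probing is a technique for examining how specific linguistic information influences a model's behaviour.
It involves identifying and removing the relevant information, then assessing whether the model's performance on the main task changes.
If the removed information is relevant to the task, the model's performance should decline.
The difficulty lies in removing only the target information while leaving all other information unchanged.
Iterative Nullspace Projection (INLP), a widely used removal technique, has been shown to introduce random modifications to representations when eliminating target information.
We demonstrate that two proposed alternatives, Mean Projection (MP) and LEACE, remove information in a more targeted manner, which increases the potential for obtaining behavioural explanations through amnesic probing.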

Abstract (original)

Amnesic probing is a technique used to examine the influence of specific linguistic information on the behaviour of a model. This involves identifying and removing the relevant information and then assessing whether the model’s performance on the main task changes. If the removed information is relevant, the model’s performance should decline. The difficulty with this approach lies in removing only the target information while leaving other information unchanged. It has been shown that Iterative Nullspace Projection (INLP), a widely used removal technique, introduces random modifications to representations when eliminating target information. We demonstrate that Mean Projection (MP) and LEACE, two proposed alternatives, remove information in a more targeted manner, thereby enhancing the potential for obtaining behavioural explanations through Amnesic Probing.
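
To make the idea of targeted removal concrete, here is a minimal sketch of a single-direction projection on toy data. It assumes that Mean Projection, for a binary property, amounts to projecting out the direction that joins the two class means; the abstract itself does not define the method, so treat this as an illustration and not as the authors' implementation. The function name `mean_projection`, the toy data, and all parameters are hypothetical.

```python
import numpy as np


def mean_projection(X, y):
    """Project out the direction joining the two class means (binary labels 0/1).

    Assumption: this is one common reading of "Mean Projection";
    the abstract does not spell out the procedure.
    """
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    d = mu1 - mu0
    d = d / np.linalg.norm(d)  # unit vector along the mean difference
    # Apply P = I - d d^T to every row: subtract each row's component along d.
    return X - np.outer(X @ d, d)


# Toy representations: the binary property is encoded along axis 0.
rng = np.random.default_rng(0)
n, dim = 200, 8
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, dim))
X[:, 0] += 3.0 * y

X_clean = mean_projection(X, y)

# Distance between the class means before and after the projection.
gap_before = np.linalg.norm(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
gap_after = np.linalg.norm(X_clean[y == 1].mean(axis=0) - X_clean[y == 0].mean(axis=0))
print(f"mean gap before: {gap_before:.3f}, after: {gap_after:.2e}")
```

Only one direction of the representation space changes here, which is the sense in which such a projection is more targeted than repeatedly projecting onto classifier nullspaces. In an amnesic probing setup, the main-task performance would then be compared between `X` and `X_clean`.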

arXiv information

Authors	Alicja Dobrzeniecka, Antske Fokkens, Pia Sommerauer
Published	2025-06-13 11:07:14+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
