Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models

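Summary

Aligned representations across languages are a desirable property of multilingual large language models (mLLMs), because alignment can improve performance on cross-lingual tasks.
Alignment typically requires fine-tuning the model, which is computationally expensive, and a sizable amount of language data, which is often unavailable.
A data-efficient alternative to fine-tuning is model intervention: manipulating a model's activations to steer generation in a desired direction.
We analyze the effect of a popular intervention (finding experts) on the alignment of cross-lingual representations in mLLMs.
We identify the neurons to manipulate for a given language and inspect the mLLMs' embedding space before and after manipulation.
We show that modifying an mLLM's activations changes its embedding space in a way that strengthens cross-lingual alignment.
We further show that these changes translate into better downstream performance on retrieval tasks, with up to a 2x improvement in top-1 accuracy on cross-lingual retrieval.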
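
The summary does not spell out how expert neurons are selected or what values they are set to, so the following is only a rough sketch of the general idea of an activation intervention. It assumes, purely for illustration, that neurons are ranked by the average precision with which their activation separates target-language inputs from other inputs, and that the selected neurons are then overwritten with a fixed value such as their mean activation on the target language. The function names (`average_precision`, `find_expert_neurons`, `intervene`) and the array layout (samples by neurons) are hypothetical and are not taken from the paper.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision of `scores` as a ranking of binary `labels` (1 = target language)."""
    order = np.argsort(-scores, kind="stable")  # highest activation first
    hits = labels[order]
    if hits.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

def find_expert_neurons(activations, labels, k):
    """Indices of the k neurons whose activations best rank target-language inputs first.

    activations: array of shape (num_samples, num_neurons)
    labels:      array of shape (num_samples,), 1 for target-language inputs, else 0
    """
    ap = np.array([
        average_precision(activations[:, j], labels)
        for j in range(activations.shape[1])
    ])
    return np.argsort(-ap, kind="stable")[:k]

def intervene(activations, experts, target_values):
    """Return a copy of `activations` with the expert neurons overwritten.

    Assumption: the intervention fixes each expert neuron to a constant value.
    """
    steered = activations.copy()
    steered[:, experts] = target_values
    return steered
```

In a real model the overwrite would happen inside the forward pass (for example through a hook on the relevant layer) rather than on a stored activation matrix; the array version above only illustrates the selection and replacement steps.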

Abstract (original)

Aligned representations across languages are a desired property in multilingual large language models (mLLMs), as alignment can improve performance in cross-lingual tasks. Typically, alignment requires fine-tuning a model, which is computationally expensive, and sizable language data, which often may not be available. A data-efficient alternative to fine-tuning is model interventions, a method for manipulating model activations to steer generation into the desired direction. We analyze the effect of a popular intervention (finding experts) on the alignment of cross-lingual representations in mLLMs. We identify the neurons to manipulate for a given language and introspect the embedding space of mLLMs pre- and post-manipulation. We show that modifying the mLLM’s activations changes its embedding space such that cross-lingual alignment is enhanced. Further, we show that the changes to the embedding space translate into improved downstream performance on retrieval tasks, with up to 2x improvements in top-1 accuracy on cross-lingual retrieval.
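
For readers unfamiliar with the metric, top-1 accuracy on cross-lingual retrieval can be sketched as follows. This is a minimal illustration, assuming parallel sentence pairs where row i of the source-language embeddings and row i of the target-language embeddings are translations of each other, and assuming cosine similarity as the retrieval score. The abstract does not state the similarity measure or the evaluation setup, and the function name `top1_retrieval_accuracy` is hypothetical.

```python
import numpy as np

def top1_retrieval_accuracy(src_emb, tgt_emb):
    """Fraction of source sentences whose nearest target embedding is their translation.

    src_emb, tgt_emb: arrays of shape (num_pairs, dim); row i of each is a parallel pair.
    Assumption: nearest neighbour is chosen by cosine similarity.
    """
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T                  # cosine similarity of every source/target pair
    predicted = sims.argmax(axis=1)     # index of the best-matching target per source
    return float((predicted == np.arange(len(src))).mean())
```

Under this setup, comparing the value computed on embeddings taken before the intervention with the value computed after it would give the kind of before/after contrast the abstract reports.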

arXiv information

Authors	Anirudh Sundar, Sinead Williamson, Katherine Metcalf, Barry-John Theobald, Skyler Seto, Masha Fedzechkina
Published	2025-02-21 18:09:54+00:00

Source and services used

arxiv.jp, Google
