High-Dimensional Interlingual Representations of Large Language Models

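Summary

Large language models (LLMs) trained on massive multilingual datasets hint at the formation of interlingual constructs, that is, a shared subspace of the representation space.
However, the evidence for this phenomenon is mixed, so it remains unclear whether these models truly develop unified interlingual representations or present only partially aligned structures.
We explore 31 diverse languages that vary in resource level, typology, and geographical region, and find that multilingual LLMs exhibit inconsistent cross-lingual alignment.
To address this, we propose an interlingual representation framework that identifies both the shared interlingual semantic subspace and the fragmented components.
We introduce the Interlingual Local Overlap (ILO) score, which quantifies interlingual alignment by comparing the local neighborhood structures of high-dimensional representations.
Using ILO, we investigate the impact of single-language fine-tuning on the interlingual representations of multilingual LLMs.
Our results show that training exclusively on a single language disrupts alignment in the early layers, whereas freezing these layers preserves the alignment of interlingual representations and improves cross-lingual generalization.
These results validate the framework and metric for evaluating interlingual representations, and further underscore that interlingual alignment is crucial for scalable multilingual learning.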

Summary (original)

Large language models (LLMs) trained on massive multilingual datasets hint at the formation of interlingual constructs, a shared subspace in the representation space. However, evidence regarding this phenomenon is mixed, leaving it unclear whether these models truly develop unified interlingual representations or present only partially aligned constructs. We explore 31 diverse languages varying in resource level, typology, and geographical region, and find that multilingual LLMs exhibit inconsistent cross-lingual alignments. To address this, we propose an interlingual representation framework that identifies both the shared interlingual semantic subspace and the fragmented components that exist due to representational limitations. We introduce the Interlingual Local Overlap (ILO) score to quantify interlingual alignment by comparing the local neighborhood structures of high-dimensional representations. We use ILO to investigate the impact of single-language fine-tuning on the interlingual representations in multilingual LLMs. Our results indicate that training exclusively on a single language disrupts the alignment in early layers, while freezing these layers preserves the alignment of interlingual representations, leading to improved cross-lingual generalization. These results validate our framework and metric for evaluating interlingual representations, and further underscore that interlingual alignment is crucial for scalable multilingual learning.
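
The abstract describes ILO only as a comparison of local neighborhood structures and does not give its formula. As a rough illustration of that general idea, and not the authors' definition, the sketch below scores two sets of representations by how much their k-nearest-neighbor sets agree. It assumes that row i of each input is the representation of the same (parallel) sentence in two different languages; the function names `knn_indices` and `local_overlap`, the Euclidean distance, and the Jaccard overlap are all choices made for this example.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest
    neighbors under Euclidean distance (the point itself is excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def local_overlap(reps_a, reps_b, k):
    """Mean Jaccard overlap between the k-NN sets of two representation
    sets. Assumption: reps_a[i] and reps_b[i] encode the same sentence
    in two languages. This is a hypothetical stand-in, not the paper's ILO."""
    if len(reps_a) != len(reps_b):
        raise ValueError("both inputs need one row per parallel sentence")
    if not 1 <= k < len(reps_a):
        raise ValueError("k must satisfy 1 <= k < number of sentences")
    nn_a = knn_indices(reps_a, k)
    nn_b = knn_indices(reps_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(nn_a, nn_b)]
    return sum(scores) / len(scores)
```

A score near 1 means each sentence has largely the same neighbors in both languages, and a score near 0 means the neighborhoods barely coincide. Computing such a score per layer, before and after fine-tuning, is one way the layer-wise comparison mentioned in the abstract could be approximated.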

arXiv information

Authors	Bryan Wilie, Samuel Cahyawijaya, Junxian He, Pascale Fung
Published	2025-03-19 12:16:42+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
