Are Structural Concepts Universal in Transformer Language Models? Towards Interpretable Cross-Lingual Generalization

Summary

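Large language models (LLMs) show considerable cross-lingual generalization ability, implicitly transferring knowledge from one language to another.
However, this transfer does not succeed equally for all languages, especially low-resource ones, and that remains an ongoing challenge.
It is unclear whether implicit cross-lingual generalization has reached its limits, and whether explicit knowledge transfer is viable.
This paper investigates whether explicitly aligning conceptual correspondences between languages can strengthen cross-lingual generalization.
Using the syntactic aspect of language as a testbed, an analysis of 43 languages reveals a high degree of alignability among the spaces of structural concepts within each language, for both encoder-only and decoder-only LLMs.
The authors then propose a meta-learning-based method that learns to align the conceptual spaces of different languages. This facilitates zero-shot and few-shot generalization in concept classification and also offers insight into the cross-lingual in-context learning phenomenon.
Experiments on syntactic analysis tasks show that the approach achieves results competitive with state-of-the-art methods and narrows the performance gap between languages, particularly benefiting those with limited resources.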
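
As a rough illustration of what "aligning the spaces of structural concepts" could mean in practice, the sketch below represents each concept by a prototype (the mean embedding of its tokens) and fits an orthogonal map between two languages' prototypes. This is an assumption made for illustration only: the abstract does not specify the alignment procedure, and orthogonal Procrustes is swapped in here as a simple stand-in, not the paper's meta-learning method. The function names, the 17 concepts, the 8 dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def concept_prototypes(embeddings, labels, concepts):
    """Return one prototype per concept: the mean embedding of its tokens."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    return np.stack([embeddings[labels == c].mean(axis=0) for c in concepts])


def procrustes_align(source, target):
    """Orthogonal matrix W minimizing ||source @ W - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def relative_error(source, target, w):
    """Residual after alignment, relative to the norm of the target."""
    return np.linalg.norm(source @ w - target) / np.linalg.norm(target)


# Synthetic stand-in for two languages (hypothetical data): language B's
# prototypes are a rotated copy of language A's, so a perfect alignment exists.
rng = np.random.default_rng(0)
protos_a = rng.normal(size=(17, 8))  # 17 hypothetical concepts, 8 dimensions
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
protos_b = protos_a @ rotation

w = procrustes_align(protos_a, protos_b)
print(relative_error(protos_a, protos_b, w))  # close to zero for this toy case
```

With real model embeddings the residual would not vanish. Under this sketch's assumptions, a small residual would be one possible reading of "high alignability" between two languages' concept spaces.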

Abstract (original)

Large language models (LLMs) have exhibited considerable cross-lingual generalization abilities, whereby they implicitly transfer knowledge across languages. However, the transfer is not equally successful for all languages, especially for low-resource ones, which poses an ongoing challenge. It is unclear whether we have reached the limits of implicit cross-lingual generalization and if explicit knowledge transfer is viable. In this paper, we investigate the potential for explicitly aligning conceptual correspondence between languages to enhance cross-lingual generalization. Using the syntactic aspect of language as a testbed, our analyses of 43 languages reveal a high degree of alignability among the spaces of structural concepts within each language for both encoder-only and decoder-only LLMs. We then propose a meta-learning-based method to learn to align conceptual spaces of different languages, which facilitates zero-shot and few-shot generalization in concept classification and also offers insights into the cross-lingual in-context learning phenomenon. Experiments on syntactic analysis tasks show that our approach achieves competitive results with state-of-the-art methods and narrows the performance gap between languages, particularly benefiting those with limited resources.

arXiv information

Authors	Ningyu Xu, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang
Published	2023-12-22 15:00:26+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
