BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment

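Summary

Large language models (LLMs) have powerful generative capabilities and extensive knowledge, and they support a wide range of everyday tasks.
However, these capabilities are concentrated mainly in high-resource languages; in low-resource languages, generation is weaker and knowledge is comparatively limited.
Strengthening the multilingual capabilities of LLMs is therefore important for serving more than 100 language communities around the world.
An intuitive way to do this is to build instruction data for each language, but building instruction data for more than 100 languages is prohibitively expensive.
This paper introduces BayLing 2, which efficiently transfers generative capabilities and knowledge from high-resource languages to low-resource languages through language alignment.
To achieve this, we constructed a dataset of 3.2 million instructions, made up of high-resource language instructions (Chinese and English) and cross-lingual instructions covering more than 100 languages, and performed instruction tuning on this dataset to promote capability transfer between languages.
Using Llama as the foundation model, we developed BayLing-2-7B, BayLing-2-13B, and BayLing-2-8B and carried out a comprehensive evaluation of BayLing.
In multilingual translation across more than 100 languages, BayLing outperforms open-source models of similar scale.
On multilingual knowledge and understanding benchmarks, BayLing achieves significant improvements across more than 20 low-resource languages, demonstrating that it can transfer knowledge effectively from high-resource to low-resource languages.
In addition, results on English benchmarks indicate that BayLing maintains high performance in high-resource languages while improving performance in low-resource languages.
The demo, homepage, code, and models of BayLing are available.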

Summary (original)

Large language models (LLMs), with their powerful generative capabilities and vast knowledge, empower various tasks in everyday life. However, these abilities are primarily concentrated in high-resource languages, leaving low-resource languages with weaker generative capabilities and relatively limited knowledge. Enhancing the multilingual capabilities of LLMs is therefore crucial for serving over 100 linguistic communities worldwide. An intuitive approach to enhance the multilingual capabilities would be to construct instruction data for various languages, but constructing instruction data for over 100 languages is prohibitively costly. In this paper, we introduce BayLing 2, which efficiently transfers generative capabilities and knowledge from high-resource languages to low-resource languages through language alignment. To achieve this, we constructed a dataset of 3.2 million instructions, comprising high-resource language instructions (Chinese and English) and cross-lingual instructions for 100+ languages and performed instruction tuning based on the dataset to facilitate the capability transfer between languages. Using Llama as the foundation model, we developed BayLing-2-7B, BayLing-2-13B, and BayLing-2-8B, and conducted a comprehensive evaluation of BayLing. For multilingual translation across 100+ languages, BayLing shows superior performance compared to open-source models of similar scale. For multilingual knowledge and understanding benchmarks, BayLing achieves significant improvements across over 20 low-resource languages, demonstrating its capability of effective knowledge transfer from high-resource to low-resource languages. Furthermore, results on English benchmarks indicate that BayLing maintains high performance in high-resource languages while enhancing the performance in low-resource languages. Demo, homepage, code and models of BayLing are available.

arXiv information

Authors	Shaolei Zhang, Kehao Zhang, Qingkai Fang, Shoutao Guo, Yan Zhou, Xiaodong Liu, Yang Feng
Published	2024-12-19 15:11:46+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
