Kuwain 1.5B: An Arabic SLM via Language Injection

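Abstract

Enhancing existing models with new knowledge is an important aspect of AI development.
This paper introduces a new method for integrating a new language into a large language model (LLM).
Our approach incorporates a previously unseen target language into an existing LLM without degrading its prior knowledge.
By injecting Arabic into a small open-source model trained mainly on English, we trained a small 1.5-billion-parameter model named Kuwain.
Our method substantially improves Arabic performance, with an average gain of 8% across various benchmarks, while retaining the model's existing knowledge using only a minimal amount of the original model's data.
This offers a cost-effective alternative to training a comprehensive model on both English and Arabic.
The results highlight the potential for efficient, targeted expansion of language models without extensive retraining or resource-intensive processes.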

Abstract (original)

Enhancing existing models with new knowledge is a crucial aspect of AI development. This paper introduces a novel method for integrating a new language into a large language model (LLM). Our approach successfully incorporates a previously unseen target language into an existing LLM without compromising its prior knowledge. We trained a tiny model with 1.5 billion parameters named Kuwain by injecting the Arabic language into a small open-source model mainly trained in English. Our method demonstrates significant improvements in Arabic language performance, with an average 8% improvement across various benchmarks, while retaining the model’s existing knowledge with a minimum amount of the original model’s data. This offers a cost-effective alternative to training a comprehensive model in both English and Arabic. The results highlight the potential for efficient, targeted language model expansion without extensive retraining or resource-intensive processes.

arXiv information

Authors	Khalil Hennara, Sara Chrouf, Mohamed Motaism Hamed, Zeina Aldallal, Omar Hadid, Safwan AlModhayan
Published	2025-04-21 14:17:25+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
