CamelEval: Advancing Culturally Aligned Arabic Language Models and Benchmarks

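Summary

Large language models (LLMs) are the foundation of modern artificial intelligence systems.
This paper introduces Juhaina, an Arabic-English bilingual LLM designed specifically to align with the values and preferences of Arabic speakers.
Juhaina natively supports advanced capabilities such as instruction following, open-ended question answering, information provision, and text processing.
Our model contains 9.24 billion parameters and is trained with a context window of up to 8,192 tokens.
The paper describes how Juhaina was built and provides an extensive empirical evaluation.
We also identify limitations of the widely adopted Open Arabic LLM Leaderboard (OALL) and propose a new evaluation benchmark, CamelEval.
Our findings show that Juhaina outperforms existing LLMs of comparable size, such as the Llama and Gemma families, at generating helpful responses in Arabic, providing factually accurate information about the region, and understanding nuanced cultural aspects.
We aim for Juhaina to democratize cutting-edge AI technology, serving more than 400 million Arabic speakers with LLMs that not only communicate in their language but also understand their culture.
All models are publicly available on Huggingface \url{https://huggingface.co/elmrc}.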

Summary (original)

Large Language Models (LLMs) are the cornerstones of modern artificial intelligence systems. This paper introduces Juhaina, an Arabic-English bilingual LLM specifically designed to align with the values and preferences of Arabic speakers. Juhaina inherently supports advanced functionalities such as instruction following, open-ended question answering, information provisioning, and text processing. Our model contains 9.24 billion parameters and is trained on a context window of up to 8,192 tokens. This paper details the creation process of Juhaina and provides an extensive empirical evaluation. Furthermore, we identify the limitations of the widely adopted Open Arabic LLM Leaderboard (OALL) and propose a new evaluation benchmark, CamelEval. Our findings demonstrate that Juhaina surpasses existing LLMs of comparable sizes, such as the Llama and Gemma families, in generating helpful responses in Arabic, providing factually accurate information about the region, and understanding nuanced cultural aspects. We aspire for Juhaina to democratize cutting-edge AI technologies, serving over 400 million Arabic speakers by offering LLMs that not only communicate in their language but also comprehend their culture. We publicly release all models on Huggingface \url{https://huggingface.co/elmrc}.

arXiv information

Authors	Zhaozhi Qian, Faroq Altam, Muhammad Alqurishi, Riad Souissi
Published	2024-09-24 08:49:21+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
