Cultural Fidelity in Large-Language Models: An Evaluation of Online Language Resources as a Driver of Model Performance in Value Representation

Summary

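The training data of an LLM embeds societal values, which makes the model more familiar with the culture of the language it is trained on.
Our analysis found that 44% of the variance in GPT-4o's ability to reflect a country's societal values, as measured by the World Values Survey, correlates with the availability of digital resources in that country's language.
Notably, the error rate for the lowest-resource languages was more than five times that for the highest-resource languages.
For GPT-4-turbo, this correlation rose to 72%, suggesting efforts to improve familiarity with non-English languages beyond web-scraped data.
Our study developed one of the largest and most robust datasets in this topic area, covering 21 country-language pairs, each with 94 survey questions verified by native speakers.
Our results highlight the link between LLM performance and the availability of digital data in the target language.
Weaker performance in low-resource languages, which is especially prominent in the Global South, may worsen digital divides.
We discuss strategies proposed to address this, including developing multilingual LLMs from the ground up and enhancing fine-tuning on diverse linguistic datasets, as seen in African language initiatives.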

Abstract (original)

The training data for LLMs embeds societal values, increasing their familiarity with the language’s culture. Our analysis found that 44% of the variance in the ability of GPT-4o to reflect the societal values of a country, as measured by the World Values Survey, correlates with the availability of digital resources in that language. Notably, the error rate was more than five times higher for the languages of the lowest resource compared to the languages of the highest resource. For GPT-4-turbo, this correlation rose to 72%, suggesting efforts to improve the familiarity with the non-English language beyond the web-scraped data. Our study developed one of the largest and most robust datasets in this topic area with 21 country-language pairs, each of which contain 94 survey questions verified by native speakers. Our results highlight the link between LLM performance and digital data availability in target languages. Weaker performance in low-resource languages, especially prominent in the Global South, may worsen digital divides. We discuss strategies proposed to address this, including developing multilingual LLMs from the ground up and enhancing fine-tuning on diverse linguistic datasets, as seen in African language initiatives.

arXiv information

Authors	Sharif Kazemi, Gloria Gerhardt, Jonty Katz, Caroline Ida Kuria, Estelle Pan, Umang Prabhakar
Published	2024-10-14 13:33:00+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
