SANSKRITI: A Comprehensive Benchmark for Evaluating Language Models’ Knowledge of Indian Culture

Summary

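Language models (LMs) are indispensable tools that shape modern workflows, but their global effectiveness depends on understanding local socio-cultural contexts.
To address this, we introduce SANSKRITI, a benchmark designed to evaluate language models' understanding of India's rich cultural diversity.
Comprising 21,853 meticulously curated question-answer pairs spanning 28 states and 8 union territories, SANSKRITI is the largest dataset for testing Indian cultural knowledge.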
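It covers sixteen key attributes of Indian culture, including rituals and ceremonies, history, tourism, cuisine, dance and music, costume, language, art, festivals, religion, medicine, transport, sports, nightlife, and personalities, providing a comprehensive representation of India's cultural tapestry.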
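We evaluate leading Large Language Models (LLMs), Indic Language Models (ILMs), and Small Language Models (SLMs) on SANSKRITI, revealing significant disparities in their ability to handle culturally nuanced queries, with many models struggling in region-specific contexts.
By offering an extensive, culturally rich, and diverse dataset, SANSKRITI sets a new standard for assessing and improving the cultural understanding of LMs.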

Summary (original)

Language Models (LMs) are indispensable tools shaping modern workflows, but their global effectiveness depends on understanding local socio-cultural contexts. To address this, we introduce SANSKRITI, a benchmark designed to evaluate language models’ comprehension of India’s rich cultural diversity. Comprising 21,853 meticulously curated question-answer pairs spanning 28 states and 8 union territories, SANSKRITI is the largest dataset for testing Indian cultural knowledge. It covers sixteen key attributes of Indian culture: rituals and ceremonies, history, tourism, cuisine, dance and music, costume, language, art, festivals, religion, medicine, transport, sports, nightlife, and personalities, providing a comprehensive representation of India’s cultural tapestry. We evaluate SANSKRITI on leading Large Language Models (LLMs), Indic Language Models (ILMs), and Small Language Models (SLMs), revealing significant disparities in their ability to handle culturally nuanced queries, with many models struggling in region-specific contexts. By offering an extensive, culturally rich, and diverse dataset, SANSKRITI sets a new standard for assessing and improving the cultural understanding of LMs.

arXiv information

Authors	Arijit Maji, Raghvendra Kumar, Akash Ghosh, Anushka, Sriparna Saha
Published	2025-06-18 11:19:25+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
