Evaluating Morphological Compositional Generalization in Large Language Models

Abstract

Large language models (LLMs) have demonstrated significant progress in various natural language generation and understanding tasks. However, their linguistic generalization capabilities remain questionable, raising doubts about whether these models learn language similarly to humans. While humans exhibit compositional generalization and linguistic creativity in language use, the extent to which LLMs replicate these abilities, particularly in morphology, is under-explored. In this work, we systematically investigate the morphological generalization abilities of LLMs through the lens of compositionality. We define morphemes as compositional primitives and design a novel suite of generative and discriminative tasks to assess morphological productivity and systematicity. Focusing on agglutinative languages such as Turkish and Finnish, we evaluate several state-of-the-art instruction-finetuned multilingual models, including GPT-4 and Gemini. Our analysis shows that LLMs struggle with morphological compositional generalization particularly when applied to novel word roots, with performance declining sharply as morphological complexity increases. While models can identify individual morphological combinations better than chance, their performance lacks systematicity, leading to significant accuracy gaps compared to humans.

arXiv information

Authors	Mete Ismayilzada, Defne Circi, Jonne Sälevä, Hale Sirin, Abdullatif Köksal, Bhuwan Dhingra, Antoine Bosselut, Lonneke van der Plas, Duygu Ataman
Published	2024-10-16 15:17:20+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
