When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models

Summary

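Modern large language models (LLMs) show human-like abilities on many language tasks, which has sparked interest in comparing how LLMs and humans process language.
This paper presents a detailed comparison of the two on a sentence comprehension task built on garden-path constructions, which are notoriously challenging for humans.
Drawing on psycholinguistic research, the authors formulate hypotheses about why garden-path sentences are hard and test them on human participants and a large suite of LLMs using comprehension questions.
The findings show that both LLMs and humans struggle with specific syntactic complexities, and that some models correlate highly with human comprehension.
To complement these findings, the authors also test LLM comprehension of garden-path constructions with paraphrasing and text-to-image generation tasks; the results mirror those of the comprehension questions, further validating the findings on LLM understanding of these constructions.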

Abstract (original)

Modern Large Language Models (LLMs) have shown human-like abilities in many language tasks, sparking interest in comparing LLMs’ and humans’ language processing. In this paper, we conduct a detailed comparison of the two on a sentence comprehension task using garden-path constructions, which are notoriously challenging for humans. Based on psycholinguistic research, we formulate hypotheses on why garden-path sentences are hard, and test these hypotheses on human participants and a large suite of LLMs using comprehension questions. Our findings reveal that both LLMs and humans struggle with specific syntactic complexities, with some models showing high correlation with human comprehension. To complement our findings, we test LLM comprehension of garden-path constructions with paraphrasing and text-to-image generation tasks, and find that the results mirror the sentence comprehension question results, further validating our findings on LLM understanding of these constructions.

arXiv information

Authors	Samuel Joseph Amouyal, Aya Meltzer-Asscher, Jonathan Berant
Published	2025-02-13 13:19:33+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
