Evaluating statistical language models as pragmatic reasoners

Summary

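Title: "Evaluating statistical language models as pragmatic reasoners"
Summary:
– The relationship between communicated language and intended meaning is often probabilistic and sensitive to context.
– Many approaches estimate this mapping with recursive Bayesian models of communication. In parallel, large language models (LLMs) are increasingly used for semantic parsing, that is, inferring logical representations from natural language.
– Existing LLM studies have been largely restricted to literal language use. This work evaluates whether LLMs can infer the meanings of pragmatic utterances, using threshold estimation for the gradable adjective "strong", conditioned on a contextual strength prior and then extended to composition with qualification, negation, polarity inversion, and class comparison.
– LLMs can derive context-grounded, human-like distributions over the interpretations of several complex pragmatic utterances, yet they struggle when composing with negation.
– These results inform our understanding of the inferential capacity of statistical language models and of their use in pragmatic and semantic parsing applications. All code used in this work is publicly available in the official GitHub repository (https://github.com/benlipkin/probsem/tree/CogSci2023).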
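To make "threshold estimation conditioned on a strength prior" concrete, the following is a minimal sketch of a literal Bayesian listener for the word "strong". It is not the paper's method (the paper evaluates LLMs, and its code lives in the repository linked above). The grid of strength values, the normal-shaped prior, the uniform threshold prior, and the function names `strength_prior`, `interpret_strong`, and `mean` are all assumptions made for illustration.

```python
import math


def strength_prior(grid, mu, sigma):
    """Discretized normal-shaped prior over strength (hypothetical context prior)."""
    weights = [math.exp(-0.5 * ((s - mu) / sigma) ** 2) for s in grid]
    total = sum(weights)
    return [w / total for w in weights]


def interpret_strong(grid, prior):
    """Literal listener: P(s | "strong") is proportional to
    P(s) * sum over theta of P(theta) * [s > theta],
    assuming a uniform prior over thresholds theta on the same grid."""
    n = len(grid)
    # Fraction of candidate thresholds that each strength value exceeds.
    truth_mass = [sum(1 for theta in grid if s > theta) / n for s in grid]
    unnorm = [p * t for p, t in zip(prior, truth_mass)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def mean(grid, dist):
    """Expected value of a discrete distribution over the grid."""
    return sum(s * p for s, p in zip(grid, dist))


if __name__ == "__main__":
    # Candidate strengths in arbitrary units (assumed, not from the paper).
    grid = [i * 10 for i in range(31)]
    for label, mu in [("context A", 100), ("context B", 200)]:
        prior = strength_prior(grid, mu, sigma=30)
        posterior = interpret_strong(grid, prior)
        print(label, round(mean(grid, prior), 1), round(mean(grid, posterior), 1))
```

Under these assumptions, hearing "strong" shifts the expected strength above the prior mean in each context, and a context with a stronger prior yields a stronger interpretation. This is the kind of context sensitivity the summary describes.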

Summary (original)

The relationship between communicated language and intended meaning is often probabilistic and sensitive to context. Numerous strategies attempt to estimate such a mapping, often leveraging recursive Bayesian models of communication. In parallel, large language models (LLMs) have been increasingly applied to semantic parsing applications, tasked with inferring logical representations from natural language. While existing LLM explorations have been largely restricted to literal language use, in this work, we evaluate the capacity of LLMs to infer the meanings of pragmatic utterances. Specifically, we explore the case of threshold estimation on the gradable adjective “strong”, contextually conditioned on a strength prior, then extended to composition with qualification, negation, polarity inversion, and class comparison. We find that LLMs can derive context-grounded, human-like distributions over the interpretations of several complex pragmatic utterances, yet struggle composing with negation. These results inform the inferential capacity of statistical language models, and their use in pragmatic and semantic parsing applications. All corresponding code is made publicly available (https://github.com/benlipkin/probsem/tree/CogSci2023).

arXiv information

Authors	Benjamin Lipkin, Lionel Wong, Gabriel Grand, Joshua B Tenenbaum
Published	2023-05-01 18:22:10+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, OpenAI
