Entity Tracking in Language Models

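Summary

[Title] Entity Tracking in Language Models
[Summary] Tracking how the states and relations of entities change as a text or dialog unfolds is a key prerequisite for discourse understanding. Nevertheless, few studies have systematically examined the ability of large language models (LLMs) to track discourse entities. This work presents a task that probes how well a language model can infer the final state of an entity, given an English description of the initial state and a series of state-changing operations. Using this task, the authors test whether Flan-T5, GPT-3, and GPT-3.5 can track entity states, and find that only the GPT-3.5 models, which were pretrained on large amounts of code, show this ability. They then investigate whether smaller models pretrained primarily on text can learn entity tracking by finetuning T5 on several training/evaluation splits. Performance degrades on the more complex splits, but even on splits with almost no lexical overlap between training and evaluation, the finetuned model can often perform non-trivial entity tracking. Together, these results suggest that language models can learn to track entities, but that pretraining on large text corpora alone does not bring this capacity out.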

Summary (original)

Keeping track of how states and relations of entities change as a text or dialog unfolds is a key prerequisite to discourse understanding. Despite this fact, there have been few systematic investigations into the ability of large language models (LLMs) to track discourse entities. In this work, we present a task to probe to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations. We use this task to first investigate whether Flan-T5, GPT-3 and GPT-3.5 can track the state of entities, and find that only GPT-3.5 models, which have been pretrained on large amounts of code, exhibit this ability. We then investigate whether smaller models pretrained primarily on text can learn to track entities, through finetuning T5 on several training/evaluation splits. While performance degrades for more complex splits, we find that even for splits with almost no lexical overlap between training and evaluation, a finetuned model can often perform non-trivial entity tracking. Taken together, these results suggest that language models can learn to track entities but pretraining on large text corpora alone does not make this capacity surface.
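
The task described in the abstract (an initial state, a series of state-changing operations, and a final state to infer) can be illustrated with a small reference simulator. This is only a sketch under assumptions: the abstract does not specify the task format, so the container names, the operation tuples (`put`, `remove`, `move`), and the function `apply_operations` are all hypothetical and chosen for illustration, not taken from the paper.

```python
# Toy ground-truth tracker for an entity-state task.
# ASSUMPTION: the world is a set of named containers holding items, and the
# operations are "put", "remove", and "move". None of this comes from the
# source; it is a hypothetical format used only to illustrate the idea.

def apply_operations(initial_state, operations):
    """Return the final state after applying each operation in order."""
    # Copy the input so the caller's initial state is left unchanged.
    state = {name: set(items) for name, items in initial_state.items()}
    for op in operations:
        kind = op[0]
        if kind == "put":
            _, item, target = op
            state[target].add(item)
        elif kind == "remove":
            _, item, source = op
            state[source].discard(item)
        elif kind == "move":
            _, item, source, target = op
            state[source].remove(item)  # raises KeyError if the item is absent
            state[target].add(item)
        else:
            raise ValueError(f"unknown operation: {kind}")
    return state


initial = {"Box 0": {"key"}, "Box 1": {"book", "pen"}, "Box 2": set()}
operations = [
    ("move", "book", "Box 1", "Box 2"),
    ("remove", "key", "Box 0"),
    ("put", "map", "Box 0"),
]
final = apply_operations(initial, operations)
print(final)
```

A simulator of this kind would supply the expected answer against which a model's prediction of the final state could be compared; how the paper itself renders states and operations as English text is not covered by this sketch.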

arXiv information

Authors	Najoung Kim, Sebastian Schuster
Published	2023-05-03 18:01:13+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
