TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

Summary

Title: TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

Summary:

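– Language models (LMs) become outdated as the world changes, and they often fail at tasks that require recent factual information that was absent or different during training. This phenomenon is called temporal misalignment.
– The problem is especially challenging because the research community still lacks a coherent dataset for assessing how well LMs adapt to frequently updated knowledge corpora such as Wikipedia.
– To address this, we introduce TemporalWiki, a lifelong benchmark for ever-evolving LMs that uses the difference between consecutive snapshots of English Wikipedia for training and of English Wikidata for evaluation.
– The benchmark lets researchers periodically track an LM's ability to retain previous knowledge and to acquire updated or new knowledge at each point in time.
– We also find that training an LM on the diff data with continual learning methods achieves similar or better perplexity than training on the entire snapshot, at 12 times less computational cost. This verifies that factual knowledge in LMs can be safely updated with minimal training data via continual learning.
– The dataset and code are available at https://github.com/joeljang/temporalwiki.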

Summary (original)

Language Models (LMs) become outdated as the world changes; they often fail to perform tasks requiring recent factual information which was absent or different during training, a phenomenon called temporal misalignment. This is especially a challenging problem because the research community still lacks a coherent dataset for assessing the adaptability of LMs to frequently-updated knowledge corpus such as Wikipedia. To this end, we introduce TemporalWiki, a lifelong benchmark for ever-evolving LMs that utilizes the difference between consecutive snapshots of English Wikipedia and English Wikidata for training and evaluation, respectively. The benchmark hence allows researchers to periodically track an LM’s ability to retain previous knowledge and acquire updated/new knowledge at each point in time. We also find that training an LM on the diff data through continual learning methods achieves similar or better perplexity than on the entire snapshot in our benchmark with 12 times less computational cost, which verifies that factual knowledge in LMs can be safely updated with minimal training data via continual learning. The dataset and the code are available at https://github.com/joeljang/temporalwiki.
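The idea of training on the difference between consecutive snapshots can be illustrated with a toy sketch. This is not the authors' pipeline (see the linked repository for that): the function names `split_sentences` and `snapshot_diff`, the dict-of-articles snapshot format, and the naive period-based sentence splitting are all assumptions made for illustration only.

```python
# Toy sketch (hypothetical helpers, not the TemporalWiki code): keep only the
# sentences that are new or changed between two snapshots of a corpus.

def split_sentences(text):
    """Naive sentence splitter; assumes sentences end with a period."""
    return [s.strip() for s in text.split(".") if s.strip()]


def snapshot_diff(old_snapshot, new_snapshot):
    """Return {title: [sentences]} for sentences in the new snapshot that
    did not appear in the same article of the old snapshot.

    Articles absent from the old snapshot contribute all their sentences;
    unchanged articles are omitted from the result.
    """
    diff = {}
    for title, text in new_snapshot.items():
        old_sentences = set(split_sentences(old_snapshot.get(title, "")))
        added = [s for s in split_sentences(text) if s not in old_sentences]
        if added:
            diff[title] = added
    return diff


if __name__ == "__main__":
    # Made-up example snapshots, keyed by article title.
    old = {"A": "X is 1. Y is 2.", "B": "Z is 3."}
    new = {"A": "X is 1. Y is 5.", "B": "Z is 3.", "C": "W is 4."}
    print(snapshot_diff(old, new))
```

In this sketch the unchanged article is dropped entirely, so the text passed on for training is much smaller than the full new snapshot, which is the intuition behind the reduced computational cost reported in the abstract.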

arXiv information

Authors	Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Minjoon Seo
Published	2023-04-12 12:16:59+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
