The Web Can Be Your Oyster for Improving Large Language Models

Summary

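Large language models (LLMs) encode a large amount of world knowledge.
However, this knowledge is frozen at training time, so the models are static and limited to the training data available at that point.
To further improve the capacity of LLMs on knowledge-intensive tasks, we consider augmenting them with the large-scale web through a search engine.
Unlike previous augmentation sources (e.g., a Wikipedia data dump), the web provides broader, more comprehensive, and constantly updated information.
In this paper, we present UNIWEB, a web-augmented LLM trained on 16 knowledge-intensive tasks in a unified text-to-text format.
Rather than simply using content retrieved from the web, our approach makes two major improvements.
First, we propose an adaptive search-engine-assisted learning method that self-evaluates the confidence of the LLM's predictions and adaptively decides when to consult the web for more data, which avoids useless or noisy augmentation from the web.
Second, to reduce the discrepancy between encoded and retrieved knowledge, we design a pretraining task based on salient span prediction, called continual knowledge learning.
Experiments on a wide range of knowledge-intensive tasks show that our model significantly outperforms previous retrieval-augmented methods.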
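
The confidence-gated retrieval idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how confidence is measured, so the normalized-entropy score, the 0.5 threshold, the function names, and the `predict` and `search` callables are all hypothetical.

```python
import math
from typing import Callable, Dict, Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of a probability distribution, scaled to [0, 1].

    Assumption: high entropy over candidate answers stands in for low
    model confidence. The paper may use a different self-evaluation signal.
    """
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def should_search(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Return True when the model looks uncertain enough to consult the web."""
    return normalized_entropy(probs) > threshold


def answer(
    question: str,
    predict: Callable[[str, str], Sequence[float]],  # hypothetical model call
    search: Callable[[str], str],                    # hypothetical search call
    threshold: float = 0.5,
) -> Dict[str, object]:
    """Predict without retrieval first; query the web only if uncertain."""
    probs = predict(question, "")
    if not should_search(probs, threshold):
        # Confident prediction: skip retrieval to avoid noisy augmentation.
        return {"used_web": False, "probs": list(probs)}
    # Uncertain prediction: retrieve context and predict again with it.
    context = search(question)
    return {"used_web": True, "probs": list(predict(question, context))}
```

Gating retrieval on a confidence score means the search engine is called only for questions where the model's own knowledge appears insufficient, which matches the stated goal of avoiding useless or noisy augmentation.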

Abstract (original)

Large language models (LLMs) encode a large amount of world knowledge. However, as such knowledge is frozen at the time of model training, the models become static and limited by the training data at that time. In order to further improve the capacity of LLMs for knowledge-intensive tasks, we consider augmenting LLMs with the large-scale web using search engine. Unlike previous augmentation sources (e.g., Wikipedia data dump), the web provides broader, more comprehensive and constantly updated information. In this paper, we present a web-augmented LLM UNIWEB, which is trained over 16 knowledge-intensive tasks in a unified text-to-text format. Instead of simply using the retrieved contents from web, our approach has made two major improvements. Firstly, we propose an adaptive search engine assisted learning method that can self-evaluate the confidence level of LLM’s predictions, and adaptively determine when to refer to the web for more data, which can avoid useless or noisy augmentation from web. Secondly, we design a pretraining task, i.e., continual knowledge learning, based on salient spans prediction, to reduce the discrepancy between the encoded and retrieved knowledge. Experiments on a wide range of knowledge-intensive tasks show that our model significantly outperforms previous retrieval-augmented methods.

arXiv information

Authors	Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, Jian-Yun Nie, Ji-Rong Wen
Published	2023-05-18 14:20:32+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
