WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences

Summary

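We present WebGLM, a web-enhanced question-answering system based on the General Language Model (GLM).
Its goal is to augment a pre-trained large language model (LLM) with web search and retrieval capabilities while remaining efficient enough for real-world deployment.
To achieve this, we develop WebGLM with strategies for an LLM-augmented retriever, a bootstrapped generator, and a human preference-aware scorer.
Specifically, we identify and address the limitations of WebGPT (OpenAI), which gives WebGLM advantages in accuracy, efficiency, and cost-effectiveness.
In addition, we propose systematic criteria for evaluating web-enhanced QA systems.
We conduct multi-dimensional human evaluation and quantitative ablation studies, which suggest that the proposed WebGLM designs outperform existing systems.
In human evaluation, WebGLM with the 10-billion-parameter GLM (10B) performs better than the similar-sized WebGPT (13B) and comparably to WebGPT (175B).
The code, demo, and data are available at https://github.com/THUDM/WebGLM.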

Abstract (original)

We present WebGLM, a web-enhanced question-answering system based on the General Language Model (GLM). Its goal is to augment a pre-trained large language model (LLM) with web search and retrieval capabilities while being efficient for real-world deployments. To achieve this, we develop WebGLM with strategies for the LLM-augmented retriever, bootstrapped generator, and human preference-aware scorer. Specifically, we identify and address the limitations of WebGPT (OpenAI), through which WebGLM is enabled with accuracy, efficiency, and cost-effectiveness advantages. In addition, we propose systematic criteria for evaluating web-enhanced QA systems. We conduct multi-dimensional human evaluation and quantitative ablation studies, which suggest the outperformance of the proposed WebGLM designs over existing systems. WebGLM with the 10-billion-parameter GLM (10B) is shown to perform better than the similar-sized WebGPT (13B) and even comparably to WebGPT (175B) in human evaluation. The code, demo, and data are at https://github.com/THUDM/WebGLM.

arXiv information

Authors	Xiao Liu, Hanyu Lai, Hao Yu, Yifan Xu, Aohan Zeng, Zhengxiao Du, Peng Zhang, Yuxiao Dong, Jie Tang
Published	2023-06-13 16:57:53+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
