Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification

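Summary

Large language models (LLMs) have demonstrated remarkable proficiency in generating fluent text.
However, they often generate inaccurate or hallucinated content.
This problem affects both non-retrieval-based and retrieval-augmented generation, and existing post-hoc rectification methods may fail to address accumulated hallucination errors that can arise from the "snowballing" problem, especially in reasoning tasks.
To tackle these challenges, we introduce a new approach called Real-time Verification and Rectification (Ever).
Rather than waiting until generation has finished to correct hallucinations, Ever uses a real-time, step-wise generation and hallucination rectification strategy.
The main goal is to detect and correct hallucinations as they occur during text generation.
Compared with both retrieval-based and non-retrieval-based baselines, Ever shows significant improvements in generating trustworthy, factually accurate text across a range of tasks, including short-form QA, biography generation, and multi-hop reasoning.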

Abstract (original)

Large Language Models (LLMs) have demonstrated remarkable proficiency in generating fluent text. However, they often encounter the challenge of generating inaccurate or hallucinated content. This issue is common in both non-retrieval-based generation and retrieval-augmented generation approaches, and existing post-hoc rectification methods may not address the accumulated hallucination errors that may be caused by the ‘snowballing’ issue, especially in reasoning tasks. To tackle these challenges, we introduce a novel approach called Real-time Verification and Rectification (Ever). Instead of waiting until the end of the generation process to rectify hallucinations, Ever employs a real-time, step-wise generation and hallucination rectification strategy. The primary objective is to detect and rectify hallucinations as they occur during the text generation process. When compared to both retrieval-based and non-retrieval-based baselines, Ever demonstrates a significant improvement in generating trustworthy and factually accurate text across a diverse range of tasks, including short-form QA, biography generation, and multi-hop reasoning.
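
The abstract describes the step-wise strategy only at a high level. As a rough illustration of the general idea (check each generated sentence before it becomes context for the next one), a minimal sketch might look like the following. This is not the authors' implementation: the function names `stepwise_generate`, `generate_step`, `verify`, and `rectify`, the sentence-level granularity, and the toy stand-ins are all assumptions made for this example.

```python
from typing import Callable, List, Optional


def stepwise_generate(
    prompt: str,
    generate_step: Callable[[str], Optional[str]],  # hypothetical: returns the next sentence, or None when done
    verify: Callable[[str, str], bool],             # hypothetical: True if the sentence looks factual
    rectify: Callable[[str, str], str],             # hypothetical: returns a corrected sentence
    max_steps: int = 16,
) -> List[str]:
    """Generate one sentence at a time, checking each before it joins the context."""
    accepted: List[str] = []
    for _ in range(max_steps):
        context = " ".join([prompt, *accepted])
        sentence = generate_step(context)
        if sentence is None:  # generator signals that it is finished
            break
        if not verify(context, sentence):
            # Correct the sentence now, so later steps never condition on the error.
            sentence = rectify(context, sentence)
        accepted.append(sentence)
    return accepted


# Toy stand-ins (assumptions for illustration only; a real system would call a model
# for generation and a verifier, optionally backed by retrieval, for checking).
def make_scripted_generator(script: List[str]) -> Callable[[str], Optional[str]]:
    steps = iter(script)
    return lambda _context: next(steps, None)


def toy_verify(_context: str, sentence: str) -> bool:
    return "Germany" not in sentence


def toy_rectify(_context: str, sentence: str) -> str:
    return sentence.replace("Germany", "France")


if __name__ == "__main__":
    gen = make_scripted_generator(
        ["Paris is the capital of Germany.", "It lies on the Seine."]
    )
    print(stepwise_generate("Tell me about Paris.", gen, toy_verify, toy_rectify))
    # prints ['Paris is the capital of France.', 'It lies on the Seine.']
```

The point of the loop is that a corrected sentence, not the erroneous one, is appended to the context, which is one way to keep an early mistake from "snowballing" into later steps.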

arXiv information

Authors	Haoqiang Kang, Juntong Ni, Huaxiu Yao
Published	2023-11-15 17:04:56+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
