Towards Long Context Hallucination Detection

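Summary

Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks.
However, they are prone to contextual hallucination: they generate information that is either unsupported by the given context or contradicts it.
Many studies have investigated contextual hallucination in LLMs, but handling it for long-context inputs remains an open problem.
In this work, we take a first step toward solving this problem by constructing a dataset designed specifically for long-context hallucination detection.
We also propose a new architecture that lets pre-trained encoder models such as BERT process long contexts and detect contextual hallucinations effectively through a decomposition and aggregation mechanism.
Our experimental results show that the proposed architecture significantly outperforms both previous models of similar size and LLM-based models across various metrics, while providing substantially faster inference.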

Abstract (original)

Large Language Models (LLMs) have demonstrated remarkable performance across various tasks. However, they are prone to contextual hallucination, generating information that is either unsubstantiated or contradictory to the given context. Although many studies have investigated contextual hallucinations in LLMs, addressing them in long-context inputs remains an open problem. In this work, we take an initial step toward solving this problem by constructing a dataset specifically designed for long-context hallucination detection. Furthermore, we propose a novel architecture that enables pre-trained encoder models, such as BERT, to process long contexts and effectively detect contextual hallucinations through a decomposition and aggregation mechanism. Our experimental results show that the proposed architecture significantly outperforms previous models of similar size as well as LLM-based models across various metrics, while providing substantially faster inference.
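
The abstract names a decomposition and aggregation mechanism but gives no details, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes the long context is split into overlapping fixed-size windows, each window is scored against the response, and the per-window scores are combined with a max. The function names, the window sizes, the threshold, and the toy token-overlap scorer (standing in for a trained encoder such as BERT) are all hypothetical.

```python
from typing import Callable, List


def split_into_chunks(tokens: List[str], chunk_size: int, stride: int) -> List[List[str]]:
    """Decompose a token sequence into overlapping windows (hypothetical helper)."""
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the sequence
        start += stride
    return chunks


def overlap_support(chunk: List[str], response: List[str]) -> float:
    """Toy stand-in for an encoder score: fraction of response tokens found in the chunk."""
    if not response:
        return 1.0
    vocab = set(chunk)
    return sum(tok in vocab for tok in response) / len(response)


def detect_hallucination(
    context: str,
    response: str,
    scorer: Callable[[List[str], List[str]], float] = overlap_support,
    chunk_size: int = 8,   # assumed window size, chosen only for illustration
    stride: int = 4,       # assumed overlap between consecutive windows
    threshold: float = 0.6,  # assumed decision threshold
) -> bool:
    """Return True if no single chunk supports the response well enough."""
    ctx_tokens = context.lower().split()
    resp_tokens = response.lower().split()
    # Decomposition: score the response against every window of the context.
    scores = [scorer(chunk, resp_tokens)
              for chunk in split_into_chunks(ctx_tokens, chunk_size, stride)]
    # Aggregation: the response counts as supported if its best window supports it.
    return max(scores) < threshold


context = ("the cat sat on the mat while the dog slept in the garden "
           "near the old oak tree")
print(detect_hallucination(context, "the dog slept in the garden"))
print(detect_hallucination(context, "the bird flew over the river"))
```

In a real system the scorer would be a learned model and the aggregation could itself be learned; max-pooling is used here only because it is the simplest choice that keeps each window within a fixed input length.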

arXiv information

Authors	Siyi Liu, Kishaloy Halder, Zheng Qi, Wei Xiao, Nikolaos Pappas, Phu Mon Htut, Neha Anna John, Yassine Benajiba, Dan Roth
Published	2025-04-28 03:47:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
