HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models

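Summary

Large language models (LLMs) often hallucinate, producing outputs that are contextually inaccurate or factually incorrect.
We introduce HICD, a new method that deliberately induces hallucinations and uses them in contrastive decoding to mitigate hallucinations.
Unlike existing contrastive decoding methods, HICD selects the attention heads most important to the model's prediction as inducing heads, induces hallucinations by dispersing the attention of those heads, and compares the hallucinated outputs with the original outputs to obtain the final result.
Our approach significantly improves performance on tasks that require contextual faithfulness, such as context completion, reading comprehension, and question answering.
It also improves factuality on tasks that require accurate knowledge recall.
We show that our inducing-head selection and attention dispersion method produces more "contrast-effective" hallucinations for contrastive decoding, outperforming other hallucination-inducing methods.
Our findings offer a promising strategy for reducing hallucinations by inducing them in a controlled manner, improving the performance of LLMs across a wide range of tasks.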

Summary (original)

Large Language Models (LLMs) often generate hallucinations, producing outputs that are contextually inaccurate or factually incorrect. We introduce HICD, a novel method designed to induce hallucinations for contrastive decoding to mitigate hallucinations. Unlike existing contrastive decoding methods, HICD selects attention heads crucial to the model’s prediction as inducing heads, then induces hallucinations by dispersing attention of these inducing heads and compares the hallucinated outputs with the original outputs to obtain the final result. Our approach significantly improves performance on tasks requiring contextual faithfulness, such as context completion, reading comprehension, and question answering. It also improves factuality in tasks requiring accurate knowledge recall. We demonstrate that our inducing heads selection and attention dispersion method leads to more ‘contrast-effective’ hallucinations for contrastive decoding, outperforming other hallucination-inducing methods. Our findings provide a promising strategy for reducing hallucinations by inducing hallucinations in a controlled manner, enhancing the performance of LLMs in a wide range of tasks.
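
The abstract describes the pipeline only in words, so the following is a minimal sketch of the two steps it names, not the authors' implementation. It assumes that "dispersing attention" means replacing the attention rows of the selected heads with a uniform distribution, and that the contrast uses the common form (1 + alpha) * log p_original - alpha * log p_induced with a plausibility cutoff. The function names and the parameters alpha and beta are hypothetical, and head selection is taken as given.

```python
import numpy as np

def log_softmax(x):
    """Numerically stable log-softmax over the last axis."""
    x = x - np.max(x, axis=-1, keepdims=True)
    return x - np.log(np.sum(np.exp(x), axis=-1, keepdims=True))

def disperse_attention(attn, inducing_heads):
    """Return a copy of attn where the chosen heads attend uniformly.

    attn has shape (heads, queries, keys) and each row sums to 1.
    Assumption: dispersion = uniform weights; this toy ignores causal masks.
    """
    out = attn.copy()
    n_keys = attn.shape[-1]
    out[inducing_heads] = 1.0 / n_keys
    return out

def contrastive_scores(orig_logits, induced_logits, alpha=1.0, beta=0.1):
    """Score tokens by contrasting original and hallucination-induced logits.

    Assumption: the generic contrastive decoding formula, not necessarily
    the exact one used by HICD.
    """
    lp_orig = log_softmax(orig_logits)
    lp_induced = log_softmax(induced_logits)
    scores = (1.0 + alpha) * lp_orig - alpha * lp_induced
    # Plausibility constraint: drop tokens the original model finds unlikely.
    cutoff = np.log(beta) + lp_orig.max()
    return np.where(lp_orig >= cutoff, scores, -np.inf)

# Toy example: the induced pass strongly prefers token 0, so the contrast
# shifts the choice toward token 1.
orig = np.array([2.0, 1.9, -5.0])
induced = np.array([3.0, 0.0, -5.0])
next_token = int(np.argmax(contrastive_scores(orig, induced)))
```

In a real model, the induced logits would come from a second forward pass in which the selected heads use the dispersed attention weights; here they are supplied directly to keep the sketch self-contained.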

arXiv information

Authors	Xinyan Jiang, Hang Ye, Yongxin Zhu, Xiaoying Zheng, Zikang Chen, Jun Gong
Published	2025-05-23 15:32:24+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
