LLM Factoscope: Uncovering LLMs’ Factual Discernment through Inner States Analysis

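Abstract

Large language models (LLMs) have transformed many domains through their broad knowledge and creative capabilities.
However, a serious problem with LLMs is their tendency to generate output that diverges from fact.
This is especially concerning in sensitive applications such as medical consultation and legal advice, where accuracy is paramount.
This paper introduces the LLM factoscope, a new Siamese network-based model that uses the inner states of LLMs for factual detection.
The authors' investigation reveals distinguishable patterns in an LLM's inner states when it generates factual versus non-factual content.
They demonstrate the LLM factoscope's effectiveness across various architectures, achieving over 96% accuracy in factual detection.
The work opens a new avenue for using LLMs' inner states for factual detection and encourages further exploration of LLMs' inner workings to improve reliability and transparency.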
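
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general Siamese idea it names: one shared encoder maps every input to an embedding, and a query is labeled by its distance to labeled reference embeddings. Everything here is an assumption made for illustration, not the authors' implementation: the names `encode`, `distance`, `classify`, `WEIGHTS`, and `SUPPORT` are hypothetical, the weights are hand-picked rather than learned, and the 3-dimensional vectors are toy stand-ins for real LLM inner states.

```python
import math

# Hypothetical, hand-picked weights for a tiny shared encoder (3 inputs -> 2 outputs).
# A real model would learn these; they are fixed here to keep the sketch deterministic.
WEIGHTS = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, -0.5],
]

def encode(state):
    """Shared encoder: linear layer, tanh, then L2 normalization."""
    out = [math.tanh(sum(w * x for w, x in zip(row, state))) for row in WEIGHTS]
    norm = math.sqrt(sum(v * v for v in out)) or 1.0  # avoid division by zero
    return [v / norm for v in out]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy labeled reference set: (inner-state vector, label). Values are made up.
SUPPORT = [
    ([1.0, 0.1, 0.0], "factual"),
    ([0.1, 1.0, 0.0], "non-factual"),
]

def classify(state):
    """Label a query by its nearest reference embedding.

    The same encoder is applied to the query and to every reference,
    which is the defining property of a Siamese setup.
    """
    query = encode(state)
    nearest = min(SUPPORT, key=lambda item: distance(query, encode(item[0])))
    return nearest[1]

if __name__ == "__main__":
    print(classify([0.8, 0.2, 0.1]))
    print(classify([0.0, 0.9, 0.0]))
```

Because both branches share `WEIGHTS`, two identical inputs always map to the same embedding, so their distance is zero. In a trained model the encoder would be optimized so that states from factual and non-factual generations land far apart.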

Abstract (original)

Large Language Models (LLMs) have revolutionized various domains with extensive knowledge and creative capabilities. However, a critical issue with LLMs is their tendency to produce outputs that diverge from factual reality. This phenomenon is particularly concerning in sensitive applications such as medical consultation and legal advice, where accuracy is paramount. In this paper, we introduce the LLM factoscope, a novel Siamese network-based model that leverages the inner states of LLMs for factual detection. Our investigation reveals distinguishable patterns in LLMs’ inner states when generating factual versus non-factual content. We demonstrate the LLM factoscope’s effectiveness across various architectures, achieving over 96% accuracy in factual detection. Our work opens a new avenue for utilizing LLMs’ inner states for factual detection and encourages further exploration into LLMs’ inner workings for enhanced reliability and transparency.

arXiv information

Authors	Jinwen He, Yujia Gong, Kai Chen, Zijin Lin, Chengan Wei, Yue Zhao
Published	2023-12-29 14:03:48+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
