Hallucination Detection on a Budget: Efficient Bayesian Estimation of Semantic Entropy

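Summary

Detecting whether an LLM is hallucinating is an important research problem. One promising approach is to estimate the semantic entropy (Farquhar et al., 2024) of the distribution of generated sequences. We propose a new algorithm for this, which has two main advantages. First, by taking a Bayesian approach, it yields much better semantic entropy estimates for a given budget of samples from the LLM. Second, the number of samples can be adjusted adaptively so that "harder" contexts receive more samples. Empirically, our approach systematically beats the baselines: to achieve the same hallucination detection quality as measured by AUROC, it requires only 59% of the samples used by Farquhar et al. (2024). Moreover, quite counterintuitively, our estimator is useful even with just one sample from the LLM.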

Summary (original)

Detecting whether an LLM hallucinates is an important research challenge. One promising way of doing so is to estimate the semantic entropy (Farquhar et al., 2024) of the distribution of generated sequences. We propose a new algorithm for doing that, with two main advantages. First, due to us taking the Bayesian approach, we achieve a much better quality of semantic entropy estimates for a given budget of samples from the LLM. Second, we are able to tune the number of samples adaptively so that 'harder' contexts receive more samples. We demonstrate empirically that our approach systematically beats the baselines, requiring only 59% of samples used by Farquhar et al. (2024) to achieve the same quality of hallucination detection as measured by AUROC. Moreover, quite counterintuitively, our estimator is useful even with just one sample from the LLM.
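
To make the quantity in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the sampled answers have already been grouped into meaning clusters (the labels are hypothetical), and it contrasts a count-based plug-in entropy with a textbook Bayesian alternative: the posterior mean entropy under a symmetric Dirichlet prior. The fixed number of classes `num_classes` and the integer pseudo-count `prior_count` are assumptions made for illustration only; the paper's estimator may differ substantially.

```python
import math
from collections import Counter


def plugin_semantic_entropy(cluster_ids):
    """Plug-in entropy (in nats) of the empirical distribution over meaning clusters."""
    counts = Counter(cluster_ids)
    n = len(cluster_ids)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n (H_0 = 0)."""
    return sum(1.0 / k for k in range(1, n + 1))


def dirichlet_mean_entropy(cluster_ids, num_classes, prior_count=1):
    """Posterior mean entropy (nats) under a symmetric Dirichlet prior.

    Assumption: a known, fixed number of meaning classes and an integer
    pseudo-count, so that the digamma terms in the standard formula
    E[H] = psi(A + 1) - sum_i (a_i / A) * psi(a_i + 1)
    reduce to harmonic numbers.
    """
    counts = Counter(cluster_ids)
    if len(counts) > num_classes:
        raise ValueError("more observed clusters than num_classes")
    # Posterior parameters: observed counts plus prior, including unseen classes.
    alphas = [c + prior_count for c in counts.values()]
    alphas += [prior_count] * (num_classes - len(counts))
    total = sum(alphas)
    return harmonic(total) - sum((a / total) * harmonic(a) for a in alphas)


if __name__ == "__main__":
    samples = ["paris", "paris", "lyon", "paris"]  # hypothetical cluster labels
    print(plugin_semantic_entropy(samples))
    print(dirichlet_mean_entropy(samples, num_classes=3))
```

With a single sample the plug-in estimate is always zero, whereas the Dirichlet posterior mean stays strictly positive because the prior keeps some mass on unseen meanings. This count-only sketch does not use sequence probabilities, so it should not be read as an explanation of the one-sample result reported in the abstract.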

arXiv information

Authors	Kamil Ciosek, Nicolò Felicioni, Sina Ghiassian
Published	2025-04-04 16:30:44+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
