Are LLMs Really Not Knowledgable? Mining the Submerged Knowledge in LLMs’ Memory

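Summary

Large language models (LLMs) have shown promise as latent knowledge bases, but they often struggle with question-answering tasks and tend to hallucinate.
Previous work attributed these problems to knowledge gaps in the model's parameters, but our investigation reveals a different phenomenon: LLMs often retain the correct knowledge even when they generate a wrong answer.
Analysis of the model's internal representations shows that the correct answer frequently appears among the high-probability tokens even though it was not selected as the final output.
Based on this observation, we introduce Hits@k, a new metric that evaluates knowledge retention independently of whether the answer is expressed correctly.
Our extensive experiments demonstrate that LLMs store far more knowledge than their QA performance suggests.
Building on these findings, we develop SkipUnsure, a method that improves answer accuracy by exploiting knowledge that is detected but not expressed.
Experiments on both open-domain and domain-specific datasets show consistent improvements, with accuracy gains of up to 11.8% on DBPedia and 6.3% on IMDB, without requiring model retraining.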

Abstract (original)

Large language models (LLMs) have shown promise as potential knowledge bases, yet they often struggle with question-answering tasks and are prone to hallucinations. While previous research attributes these issues to knowledge gaps in the model’s parameters, our investigation reveals a different phenomenon: LLMs often retain correct knowledge even when generating incorrect answers. Through analysis of model’s internal representations, we find that correct answers frequently appear among high-probability tokens despite not being selected as final outputs. Based on this observation, we introduce Hits@k, a new metric to assess knowledge retention independent of expression accuracy. Our extensive experiments demonstrate that LLMs store significantly more knowledge than their QA performance suggests. Building on these findings, we develop SkipUnsure, a method to improve answer accuracy by leveraging detected but unexpressed knowledge. Experiments on both open-domain and specific-domain datasets show consistent improvements, with accuracy gains of up to 11.8% on DBPedia and 6.3% on IMDB, without requiring model retraining.
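
The abstract does not spell out how Hits@k is computed, so the following is only a minimal sketch under the usual ranking-metric reading: a question counts as a hit if the gold answer is among the k highest-probability candidate tokens, whether or not it was the top choice. The helper names (`top_k_tokens`, `hits_at_k`), the single-token answers, and the toy probabilities are all illustrative assumptions, not the authors' implementation or data.

```python
from typing import List, Mapping, Sequence


def top_k_tokens(probs: Mapping[str, float], k: int) -> List[str]:
    """Return the k tokens with the highest probability, best first."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


def hits_at_k(
    distributions: Sequence[Mapping[str, float]],
    gold: Sequence[str],
    k: int,
) -> float:
    """Fraction of questions whose gold token is among the top-k candidates.

    Assumption: each answer is a single token and each distribution maps
    candidate tokens to the model's probability for that question.
    """
    if len(distributions) != len(gold):
        raise ValueError("need exactly one gold answer per distribution")
    if not gold:
        return 0.0
    hits = sum(
        answer in top_k_tokens(probs, k)
        for probs, answer in zip(distributions, gold)
    )
    return hits / len(gold)


# Hypothetical toy data: three questions, three candidate tokens each.
distributions = [
    {"Paris": 0.55, "Lyon": 0.30, "Rome": 0.15},
    {"Rome": 0.50, "Madrid": 0.35, "Lisbon": 0.15},
    {"Oslo": 0.60, "Bergen": 0.25, "Stockholm": 0.15},
]
gold = ["Paris", "Madrid", "Stockholm"]

for k in (1, 2, 3):
    print(f"Hits@{k} = {hits_at_k(distributions, gold, k):.2f}")
# Prints:
# Hits@1 = 0.33
# Hits@2 = 0.67
# Hits@3 = 1.00
```

In this toy setting Hits@1 coincides with greedy answer accuracy, and the gap between Hits@1 and Hits@2 or Hits@3 corresponds to the "retained but not expressed" knowledge the abstract describes.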

arXiv information

Authors	Xingjian Tao,Yiwei Wang,Yujun Cai,Zhicheng Yang,Jing Tang
Published	2024-12-30 10:29:18+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
