Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering

Abstract

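The collaborative paradigm of large and small language models (LMs) effectively balances performance and cost, but its central challenge is pinpointing exactly when to invoke the large LM as hallucinations arise in the small LM.
Previous optimization efforts focused mainly on post-processing techniques separate from the LMs' reasoning process, which led to high computational cost and limited effectiveness.
This paper proposes a practical invocation evaluation metric called AttenHScore, which computes the accumulation and propagation of hallucinations during the small LM's generation process, continuously amplifying potential reasoning errors.
Dynamically adjusting the detection threshold yields more accurate real-time invocation of the large LM.
In addition, given the limited reasoning capacity of small LMs, uncertainty-aware knowledge reorganization is used to help them better capture key information from different text chunks.
Extensive experiments show that AttenHScore outperforms most baselines at improving real-time hallucination detection across multiple QA datasets, especially on complex queries.
Moreover, the strategies require no additional model training and adapt flexibly to various transformer-based LMs.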

Abstract (original)

The collaborative paradigm of large and small language models (LMs) effectively balances performance and cost, yet its pivotal challenge lies in precisely pinpointing the moment of invocation when hallucinations arise in small LMs. Previous optimization efforts primarily focused on post-processing techniques, which were separate from the reasoning process of LMs, resulting in high computational costs and limited effectiveness. In this paper, we propose a practical invocation evaluation metric called AttenHScore, which calculates the accumulation and propagation of hallucinations during the generation process of small LMs, continuously amplifying potential reasoning errors. By dynamically adjusting the detection threshold, we achieve more accurate real-time invocation of large LMs. Additionally, considering the limited reasoning capacity of small LMs, we leverage uncertainty-aware knowledge reorganization to help them better capture critical information from different text chunks. Extensive experiments reveal that our AttenHScore outperforms most baselines in enhancing real-time hallucination detection capabilities across multiple QA datasets, especially when addressing complex queries. Moreover, our strategies eliminate the need for additional model training and display flexibility in adapting to various transformer-based LMs.
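
The invocation loop outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not give the AttenHScore formula, so `accumulated_uncertainty` below is a stand-in that accumulates token-level negative log-probabilities and carries part of the running value forward. The threshold update (a quantile of recent scores) is likewise an assumption, and every name here (`AdaptiveRouter`, `carry`, `target_rate`, `window`, `min_history`, `initial_threshold`) is hypothetical.

```python
import math
from collections import deque


def accumulated_uncertainty(token_probs, carry=0.5):
    """Return the peak of a running uncertainty score over generated tokens.

    Stand-in score, NOT AttenHScore: each token adds -log(p), and a fraction
    `carry` of the previous running value is propagated to the next step.
    """
    running, peak = 0.0, 0.0
    for p in token_probs:
        running = carry * running - math.log(max(p, 1e-12))
        peak = max(peak, running)
    return peak


class AdaptiveRouter:
    """Escalate a query to the large LM when its score exceeds a moving threshold."""

    def __init__(self, target_rate=0.3, window=100, min_history=10,
                 initial_threshold=2.0):
        self.target_rate = target_rate      # assumed share of queries to escalate
        self.min_history = min_history      # scores needed before adapting
        self.scores = deque(maxlen=window)  # recent scores from the small LM
        self.threshold = initial_threshold

    def should_invoke_large(self, token_probs):
        score = accumulated_uncertainty(token_probs)
        decision = score > self.threshold
        self.scores.append(score)
        # Assumed adjustment rule: once enough history exists, move the
        # threshold to the (1 - target_rate) quantile of recent scores.
        if len(self.scores) >= self.min_history:
            ranked = sorted(self.scores)
            index = min(len(ranked) - 1,
                        int((1 - self.target_rate) * len(ranked)))
            self.threshold = ranked[index]
        return decision


router = AdaptiveRouter()
confident = router.should_invoke_large([0.9, 0.95, 0.9])  # small LM answer kept
uncertain = router.should_invoke_large([0.2, 0.1, 0.3])   # large LM invoked
```

In this sketch the token probabilities are plain lists; in practice they would come from the small LM's output distribution during generation, so the decision adds no extra model training or post-processing pass.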

arXiv information

Authors	Jihao Zhao, Chunlai Zhou, Biao Qin
Published	2025-05-05 01:45:56+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
