Likelihood as a Performance Gauge for Retrieval-Augmented Generation

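Summary

Recent work has found that retrieval-augmented generation with large language models tends to be affected by the order in which retrieved documents appear in the context.
However, the lack of in-depth analysis limits how far this phenomenon can be exploited for prompt engineering in practice.
This study posits that likelihoods serve as an effective gauge of language model performance.
Through experiments on two question-answering datasets with a variety of state-of-the-art language models, the authors reveal correlations between answer accuracy and the likelihood of the question, at both the corpus level and the instance level.
They also find that question likelihood can indicate where the task-relevant information sits in the context.
Based on these findings, they propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
Experiments demonstrate the effectiveness of both methods.
The likelihood-based methods are also efficient: they only need to compute the likelihood of the input, so they require far fewer language model passes than heuristic prompt engineering methods, which must generate responses.
The analysis deepens the understanding of how input prompts affect model performance and offers a promising direction for efficient prompt optimization.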
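
The idea of using question likelihood to choose among candidate prompts can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give the details of their two methods. The function names (`build_prompt`, `select_ordering`, `toy_question_log_likelihood`), the prompt template, and the assumption that a higher question likelihood is the preferred direction are all hypothetical choices made here for illustration. With a real language model, the scorer would typically return the sum of the token log-probabilities of the question conditioned on the preceding context, which takes a single forward pass and no generation. The toy scorer below is only a deterministic stand-in so that the example runs without a model.

```python
import math
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: (ordered documents, question) -> log-likelihood of the question.
Scorer = Callable[[Sequence[str], str], float]


def build_prompt(documents: Sequence[str], question: str) -> str:
    """Assumed prompt template: documents first, then the question."""
    context = "\n\n".join(documents)
    return f"{context}\n\nQuestion: {question}\nAnswer:"


def select_ordering(
    documents: Sequence[str],
    question: str,
    question_log_likelihood: Scorer,
    max_candidates: int = 24,
) -> Tuple[List[str], float]:
    """Return the document ordering with the highest question log-likelihood.

    Only the first `max_candidates` permutations are scored, since the number
    of orderings grows factorially with the number of documents.
    """
    best_order: List[str] = list(documents)
    best_score = -math.inf
    for i, order in enumerate(permutations(documents)):
        if i >= max_candidates:
            break
        score = question_log_likelihood(order, question)
        if score > best_score:  # strict '>' keeps the first ordering on ties
            best_order, best_score = list(order), score
    return best_order, best_score


def toy_question_log_likelihood(documents: Sequence[str], question: str) -> float:
    """Toy stand-in for a model-based scorer (an assumption, not a real LM).

    It pretends the question is more likely when the document sharing the most
    words with it is placed closer to the question, i.e. later in the context.
    """
    q_words = set(question.lower().strip("?").split())
    overlaps = [len(q_words & set(d.lower().strip(".").split())) for d in documents]
    relevant_idx = overlaps.index(max(overlaps))
    distance_from_question = len(documents) - 1 - relevant_idx
    return -float(distance_from_question)


docs = [
    "Paris is the capital of France.",
    "Bananas are yellow.",
    "The moon orbits Earth.",
]
question = "What is the capital of France?"

best_order, best_score = select_ordering(docs, question, toy_question_log_likelihood)
prompt = build_prompt(best_order, question)
```

Swapping `toy_question_log_likelihood` for a scorer backed by an actual model leaves `select_ordering` unchanged; the cost of the selection is then one scoring pass per candidate ordering.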

Summary (original)

Recent work finds that retrieval-augmented generation with large language models is prone to be influenced by the order of retrieved documents in the context. However, the lack of in-depth analysis limits the use of this phenomenon for prompt engineering in practice. In this study, we posit that likelihoods serve as an effective gauge for language model performance. Through experiments on two question-answering datasets with a variety of state-of-the-art language models, we reveal correlations between answer accuracy and the likelihood of the question at both the corpus level and the instance level. In addition, we find that question likelihood can also indicate the position of the task-relevant information in the context. Based on these findings, we propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance. We demonstrate their effectiveness with experiments. In addition, our likelihood-based methods are efficient, as they only need to compute the likelihood of the input, requiring much fewer language model passes than heuristic prompt engineering methods that require generating responses. Our analysis deepens our understanding of how input prompts affect model performance and provides a promising direction for efficient prompt optimization.

arXiv information

Authors	Tianyu Liu, Jirui Qi, Paul He, Arianna Bisazza, Mrinmaya Sachan, Ryan Cotterell
Published	2024-11-12 13:14:09+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
