Parallel Key-Value Cache Fusion for Position Invariant RAG

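Summary

Recent advances in large language models (LLMs) have highlighted the need for retrieval-augmented generation (RAG) to make use of external information.
However, LLMs are sensitive to where relevant information sits in the context and tend to produce incorrect responses when it is placed in the middle, a behavior known as the "Lost in the Middle" phenomenon.
This paper introduces a framework that produces consistent outputs for decoder-only models regardless of the order of the input contexts.
Experimental results on three open-domain question answering tasks show position invariance, meaning the model is unaffected by the order of the input contexts, and better robustness to irrelevant passages than prevailing approaches for RAG pipelines.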

Summary (original)

Recent advancements in Large Language Models (LLMs) underscore the necessity of Retrieval Augmented Generation (RAG) to leverage external information. However, LLMs are sensitive to the position of relevant information within contexts and tend to generate incorrect responses when such information is placed in the middle, known as the "Lost in the Middle" phenomenon. In this paper, we introduce a framework that generates consistent outputs for decoder-only models, irrespective of the input context order. Experimental results for three open domain question answering tasks demonstrate position invariance, where the model is not sensitive to input context order, and superior robustness to irrelevant passages compared to prevailing approaches for RAG pipelines.
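The position-invariance property described in the abstract can be phrased as a simple check: the output should be identical for every ordering of the retrieved passages. The sketch below only illustrates that property; it is not the paper's framework. The function `is_position_invariant` and both toy answerers are hypothetical stand-ins introduced here, not real models or library APIs.

```python
from itertools import permutations
from typing import Callable, Sequence

# Hypothetical interface: takes a question and an ordered list of passages,
# returns an answer string. A real RAG pipeline would call an LLM here.
Answerer = Callable[[str, Sequence[str]], str]


def is_position_invariant(answer: Answerer, question: str,
                          passages: Sequence[str]) -> bool:
    """Return True if `answer` gives one and the same output for every
    ordering of `passages` (exhaustive, so only practical for short lists)."""
    outputs = {answer(question, list(order)) for order in permutations(passages)}
    return len(outputs) == 1


def first_passage_answerer(question: str, passages: Sequence[str]) -> str:
    # Toy order-sensitive stand-in: only ever looks at the first passage.
    return passages[0]


def best_overlap_answerer(question: str, passages: Sequence[str]) -> str:
    # Toy order-insensitive stand-in: scores each passage independently by
    # word overlap with the question; ties are broken by the passage text,
    # so the result does not depend on input order.
    q_words = set(question.lower().split())
    return max(passages,
               key=lambda p: (len(q_words & set(p.lower().split())), p))
```

Because the check enumerates all permutations, it is meant for a handful of passages; for longer contexts one would sample orderings instead.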

arXiv information

Authors	Philhoon Oh, Jinwoo Shin, James Thorne
Published	2025-01-13 17:50:30+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
