One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models

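Summary

Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs) so that they generate more factual, accurate, and up-to-date content.
Existing methods either optimize prompts that guide the LLM to use the retrieved information, or directly fine-tune the LLM to adapt it to RAG scenarios.
Fine-tuning can improve performance, but because it modifies the model's parameters it often degrades the LLM's general generation capabilities.
This limitation causes problems in real applications, especially when the LLM is already deployed, because parameter changes can affect its original functionality.
To address this, the authors propose a new method that learns scalable and pluggable virtual tokens for RAG.
By keeping the LLM's original parameters and fine-tuning only the embeddings of these pluggable tokens, the approach improves the LLM's performance while preserving its general generation capabilities.
The authors also design several training strategies to improve the scalability, flexibility, and generalizability of the method.
Comprehensive experiments across 12 question-answering tasks demonstrate the superiority of the approach.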

Abstract (original)

Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs) for generating more factual, accurate, and up-to-date content. Existing methods either optimize prompts to guide LLMs in leveraging retrieved information or directly fine-tune LLMs to adapt to RAG scenarios. Although fine-tuning can yield better performance, it often compromises the LLMs’ general generation capabilities by modifying their parameters. This limitation poses challenges in practical applications, especially when LLMs are already deployed, as parameter adjustments may affect their original functionality. To address this, we propose a novel method that involves learning scalable and pluggable virtual tokens for RAG. By maintaining the LLMs’ original parameters and fine-tuning only the embeddings of these pluggable tokens, our approach not only enhances LLMs’ performance but also preserves their general generation capabilities. Furthermore, we design several training strategies to improve the scalability, flexibility, and generalizability of our method. Comprehensive experiments across 12 question-answering tasks demonstrate the superiority of our approach.
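
The core idea in the abstract, keeping the LLM's parameters fixed and updating only the embeddings of a few added virtual tokens, can be illustrated with a toy sketch. This is not the authors' implementation. The "model" is a hypothetical mean-pooling scorer in plain Python, and every name, value, and the placement of the virtual tokens between the retrieved passage and the question are assumptions made for illustration.

```python
# Toy illustration only: a frozen "model" plus trainable virtual-token embeddings.
# All names and numbers here are hypothetical and not taken from the paper.

DIM = 4

# Frozen stand-ins for the LLM's parameters; training never writes to them.
FROZEN_EMBED = {
    "doc": [0.2, -0.1, 0.4, 0.3],        # embedding of a retrieved passage
    "question": [-0.3, 0.5, 0.1, -0.2],  # embedding of the user question
}
FROZEN_READOUT = [0.5, -1.0, 0.25, 1.0]


def forward(virtual_tokens):
    """Score [doc] + virtual tokens + [question] with the frozen model."""
    seq = [FROZEN_EMBED["doc"]] + virtual_tokens + [FROZEN_EMBED["question"]]
    mean = [sum(vec[d] for vec in seq) / len(seq) for d in range(DIM)]
    return sum(w * m for w, m in zip(FROZEN_READOUT, mean))


def train(num_virtual=2, target=1.0, lr=0.5, steps=200):
    """Gradient descent on the virtual-token embeddings only."""
    virtual = [[0.0] * DIM for _ in range(num_virtual)]
    seq_len = num_virtual + 2
    for _ in range(steps):
        err = forward(virtual) - target
        # For loss = err**2, d(loss)/d(virtual[i][d]) = 2 * err * w[d] / seq_len.
        for vec in virtual:
            for d in range(DIM):
                vec[d] -= lr * 2 * err * FROZEN_READOUT[d] / seq_len
    return virtual


if __name__ == "__main__":
    baseline = forward([])
    virtual = train()
    print("score with virtual tokens plugged in:", forward(virtual))
    print("baseline unchanged when unplugged:", forward([]) == baseline)
```

Because only the `virtual` list is ever updated, calling `forward([])` after training gives the same score as before training. This mirrors, in miniature, the abstract's claim that the tokens are pluggable and leave the original model's behavior intact when removed.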

arXiv information

Authors	Yutao Zhu, Zhaoheng Huang, Zhicheng Dou, Ji-Rong Wen
Published	2024-12-11 10:56:03+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
