ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Summary

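This work introduces ChatQA 2, a Llama3-based model designed to close the gap between open-access LLMs and leading proprietary models (e.g., GPT-4-Turbo) in long-context understanding and retrieval-augmented generation (RAG) capabilities.
These two capabilities are essential for LLMs to process volumes of information too large to fit into a single prompt, and they complement each other depending on the downstream task and the computational budget.
The authors present a detailed continued-training recipe that extends the context window of Llama3-70B-base from 8K to 128K tokens, together with a three-stage instruction-tuning process that strengthens the model's instruction following, RAG performance, and long-context understanding.
Their results show that the Llama3-ChatQA-2-70B model achieves accuracy comparable to GPT-4-Turbo-2024-0409 on many long-context understanding tasks and surpasses it on the RAG benchmark.
Interestingly, they find that a state-of-the-art long-context retriever can alleviate the top-k context fragmentation issue in RAG, further improving RAG-based results on long-context understanding tasks.
They also provide extensive comparisons between RAG and long-context solutions using state-of-the-art long-context LLMs.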

Abstract (original)

In this work, we introduce ChatQA 2, a Llama3-based model designed to bridge the gap between open-access LLMs and leading proprietary models (e.g., GPT-4-Turbo) in long-context understanding and retrieval-augmented generation (RAG) capabilities. These two capabilities are essential for LLMs to process large volumes of information that cannot fit into a single prompt and are complementary to each other, depending on the downstream tasks and computational budgets. We present a detailed continued training recipe to extend the context window of Llama3-70B-base from 8K to 128K tokens, along with a three-stage instruction tuning process to enhance the model’s instruction-following, RAG performance, and long-context understanding capabilities. Our results demonstrate that the Llama3-ChatQA-2-70B model achieves accuracy comparable to GPT-4-Turbo-2024-0409 on many long-context understanding tasks and surpasses it on the RAG benchmark. Interestingly, we find that the state-of-the-art long-context retriever can alleviate the top-k context fragmentation issue in RAG, further improving RAG-based results for long-context understanding tasks. We also provide extensive comparisons between RAG and long-context solutions using state-of-the-art long-context LLMs.

arXiv information

Authors	Peng Xu, Wei Ping, Xianchao Wu, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro
Published	2024-07-19 17:35:47+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
