A Survey on Transformer Context Extension: Approaches and Evaluation

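Abstract

Large language models (LLMs) based on the Transformer have been widely applied in the field of natural language processing (NLP) and show strong performance, particularly on short-text tasks.
In long-context scenarios, however, the performance of LLMs degrades because of several challenges.
Many recent works have been proposed to mitigate this degradation.
In this survey, we first list the challenges of applying pre-trained LLMs to long contexts.
We then systematically review approaches related to long context and propose a taxonomy that groups them into four main types: positional encoding, context compression, retrieval augmented, and attention pattern.
Beyond the approaches, we focus on long-context evaluation, organizing the relevant data, tasks, and metrics based on existing long-context benchmarks.
Finally, we summarize unresolved issues in the long-context domain and offer our views on future developments.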

Abstract (original)

Large language models (LLMs) based on Transformer have been widely applied in the field of natural language processing (NLP), demonstrating strong performance, particularly in handling short text tasks. However, when it comes to long context scenarios, the performance of LLMs degrades due to some challenges. To alleviate this phenomenon, a number of works have been proposed recently. In this survey, we first list the challenges of applying pre-trained LLMs to process long contexts. We then systematically review the approaches related to long context and propose our taxonomy categorizing them into four main types: positional encoding, context compression, retrieval augmented, and attention pattern. In addition to the approaches, we focus on the evaluation of long context, organizing relevant data, tasks, and metrics based on existing long context benchmarks. Finally, we summarize unresolved issues in the long context domain and put forward our views on future developments.

arXiv information

Authors	Yijun Liu, Jinzheng Yu, Yang Xu, Zhongyang Li, Qingfu Zhu
Published	2025-03-17 15:44:09+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
