In-context Autoencoder for Context Compression in a Large Language Model

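Summary

We propose the In-context Autoencoder (ICAE), which uses the capabilities of a large language model (LLM) to compress a long context into short, compact memory slots that the LLM can condition on directly for a variety of purposes.
ICAE is first pretrained on large amounts of text data with both autoencoding and language modeling objectives, so that it can generate memory slots that accurately and comprehensively represent the original context.
It is then fine-tuned on instruction data to produce desirable responses to various prompts.
Experiments show that a lightweight ICAE, which introduces about 1% additional parameters, effectively achieves $4\times$ context compression based on Llama, improves both latency and GPU memory cost during inference, and offers an interesting insight into memorization as well as potential for scalability.
These promising results suggest a new perspective on the connection between working memory in cognitive science and representation learning in LLMs, highlight ICAE's significant implications for addressing the long-context problem, and point to further research on LLM context management.
The data, code, and models are available at https://github.com/getao/icae.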

Abstract (original)

We propose the In-context Autoencoder (ICAE), leveraging the power of a large language model (LLM) to compress a long context into short compact memory slots that can be directly conditioned on by the LLM for various purposes. ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data, enabling it to generate memory slots that accurately and comprehensively represent the original context; then, it is fine-tuned on instruction data for producing desirable responses to various prompts. Experiments demonstrate that our lightweight ICAE, introducing about 1% additional parameters, effectively achieves $4\times$ context compression based on Llama, offering advantages in both improved latency and GPU memory cost during inference, and showing an interesting insight in memorization as well as potential for scalability. These promising results imply a novel perspective on the connection between working memory in cognitive science and representation learning in LLMs, revealing ICAE’s significant implications in addressing the long context problem and suggesting further research in LLM context management. Our data, code and models are available at https://github.com/getao/icae.
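
To make the two headline figures concrete, the following is a minimal back-of-the-envelope sketch of what $4\times$ compression and about 1% additional parameters amount to. It is not taken from the ICAE code: the function names, the rounding-up rule for contexts that do not divide evenly, and the 7-billion-parameter base size in the usage note are all assumptions for illustration.

```python
import math


def memory_slots_needed(context_tokens: int, compression_ratio: int = 4) -> int:
    """Estimate how many memory slots a context occupies at a fixed ratio.

    Assumption: partial groups are rounded up to a whole slot. The abstract
    only states the 4x ratio, not how uneven lengths are handled.
    """
    if context_tokens < 0:
        raise ValueError("context_tokens must be non-negative")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return math.ceil(context_tokens / compression_ratio)


def added_parameters(base_params: int, fraction: float = 0.01) -> int:
    """Estimate the extra parameter count for a given fraction of the base model.

    The default of 0.01 reflects the "about 1%" figure in the abstract.
    """
    if base_params < 0:
        raise ValueError("base_params must be non-negative")
    return round(base_params * fraction)


if __name__ == "__main__":
    # Hypothetical inputs: a 512-token context and a 7B-parameter base model.
    print(memory_slots_needed(512))
    print(added_parameters(7_000_000_000))
```

Under these assumptions, a 512-token context would be represented by 128 memory slots, and a hypothetical 7-billion-parameter base model would gain roughly 70 million parameters.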

arXiv information

Authors	Tao Ge, Jing Hu, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei
Published	2024-03-18 00:45:48+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
