Large Language Models are Demonstration Pre-Selectors for Themselves

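Summary

In-context learning (ICL) with large language models (LLMs) achieves strong few-shot performance by selecting a small number of demonstrations from the entire training set.
However, existing ICL methods that select demonstrations by similarity or diversity scores incur high computational cost, because they repeatedly retrieve from a large-scale dataset for every query.
To address this, we propose FEEDER (FEw yet Essential Demonstration prE-selectoR), a novel pre-selection framework that identifies a representative subset of demonstrations, containing the most representative examples in the training data, tailored to a specific LLM.
To construct this subset, we introduce "sufficiency" and "necessity" metrics in the pre-selection stage and design a tree-based algorithm that identifies representative examples efficiently.
Once pre-selected, this representative subset can effectively replace the full training data, improving efficiency while maintaining comparable ICL performance.
The pre-selected subset also benefits LLM fine-tuning, for which we introduce a bi-level optimization method that improves training efficiency without sacrificing performance.
Experiments with LLMs ranging from 300M to 8B parameters show that FEEDER can reduce training data size by over 20% while maintaining performance, and that it integrates seamlessly with various downstream demonstration selection strategies in ICL.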

Summary (original)

In-context learning (ICL) with large language models (LLMs) delivers strong few-shot performance by choosing few-shot demonstrations from the entire training data. However, existing ICL methods, which rely on similarity or diversity scores to choose demonstrations, incur high computational costs due to repeated retrieval from large-scale datasets for each query. To this end, we propose FEEDER (FEw yet Essential Demonstration prE-selectoR), a novel pre-selection framework that identifies a representative subset of demonstrations containing the most representative examples in the training data, tailored to specific LLMs. To construct this subset, we introduce the ‘sufficiency’ and ‘necessity’ metrics in the pre-selection stage and design a tree-based algorithm to identify representative examples efficiently. Once pre-selected, this representative subset can effectively replace the full training data, improving efficiency while maintaining comparable performance in ICL. Additionally, our pre-selected subset also benefits fine-tuning LLMs, where we introduce a bi-level optimization method that enhances training efficiency without sacrificing performance. Experiments with LLMs ranging from 300M to 8B parameters show that FEEDER can reduce training data size by over 20% while maintaining performance and seamlessly integrating with various downstream demonstration selection strategies in ICL.

arXiv information

Authors	Jiarui Jin, Yuwei Wu, Haoxuan Li, Xiaoting He, Weinan Zhang, Yiming Yang, Yong Yu, Jun Wang, Mengyue Yang
Published	2025-06-06 12:29:03+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
