Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning

Abstract

Prompt tuning, in which prompts are optimized to adapt large-scale pre-trained language models to downstream tasks instead of fine-tuning the full model parameters, has been shown to be particularly effective when the prompts are trained in a multi-task transfer learning setting. These methods generally involve individually training prompts for each source task and then aggregating them to provide the initialization of the prompt for the target task. However, this approach critically ignores the fact that some of the source tasks could be negatively or positively interfering with each other. We argue that when we extract knowledge from source tasks via training source prompts, we need to consider this correlation among source tasks for better transfer to target tasks. To this end, we propose a Bayesian approach where we work with the posterior distribution of prompts across source tasks. We obtain representative source prompts corresponding to the samples from the posterior utilizing Stein Variational Gradient Descent, which are then aggregated to constitute the initial target prompt. We show extensive experimental results on the standard benchmark NLP tasks, where our Bayesian multi-task transfer learning approach outperforms the state-of-the-art methods in many settings. Furthermore, our approach requires no auxiliary models other than the prompt itself, achieving a high degree of parameter efficiency.
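
The abstract names Stein Variational Gradient Descent (SVGD) as the tool for drawing representative prompts from the posterior but gives no implementation details. The following is a minimal sketch of generic SVGD on a toy problem, not the authors' implementation. Everything concrete here is an assumption made for illustration: each "prompt" is a 2-dimensional vector, each source task contributes a quadratic log-likelihood centred at a hypothetical point in `task_centers`, the kernel is an RBF with fixed bandwidth, and the particles are aggregated by a simple average.

```python
import math
import random


def grad_log_posterior(theta, task_centers):
    # Toy assumption: source task k has log-likelihood -0.5 * ||theta - c_k||^2
    # and the prior is flat, so the gradient is the sum of (c_k - theta).
    return [sum(c[d] - theta[d] for c in task_centers) for d in range(len(theta))]


def rbf_kernel(x, y, bandwidth):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def svgd_step(particles, task_centers, step_size=0.05, bandwidth=1.0):
    n = len(particles)
    dim = len(particles[0])
    grads = [grad_log_posterior(p, task_centers) for p in particles]
    updated = []
    for xi in particles:
        phi = [0.0] * dim
        for xj, gj in zip(particles, grads):
            k = rbf_kernel(xj, xi, bandwidth)
            for d in range(dim):
                # Driving term pulls particles toward high posterior density;
                # the second term (gradient of k with respect to xj) pushes
                # particles apart so they do not collapse to a single mode.
                phi[d] += k * gj[d] + k * (xi[d] - xj[d]) / bandwidth ** 2
        updated.append([xi[d] + step_size * phi[d] / n for d in range(dim)])
    return updated


def run_svgd(task_centers, n_particles=8, n_steps=500, seed=0):
    rng = random.Random(seed)
    dim = len(task_centers[0])
    particles = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    for _ in range(n_steps):
        particles = svgd_step(particles, task_centers)
    return particles


def aggregate(particles):
    # Assumption: a plain average stands in for the aggregation step,
    # which the abstract does not specify.
    n = len(particles)
    return [sum(p[d] for p in particles) / n for d in range(len(particles[0]))]


# Hypothetical per-task optima for three source tasks.
task_centers = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
particles = run_svgd(task_centers)
initial_target_prompt = aggregate(particles)
```

In this toy setting the posterior is a Gaussian centred at the mean of `task_centers`, so the averaged particles should land near `[1.0, 1.0]` while the individual particles stay spread out. In a real setting the particles would be full soft-prompt matrices and the gradient would come from backpropagating source-task losses through a frozen language model.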

arXiv information

Authors	Haeju Lee, Minchan Jeong, Se-Young Yun, Kee-Eung Kim
Published	2024-02-13 16:57:02+00:00
