Prompt Optimization via Adversarial In-Context Learning

Abstract

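We propose adversarial in-context learning (adv-ICL), a new method that optimizes prompts for in-context learning (ICL) by using one LLM as a generator, a second as a discriminator, and a third as a prompt modifier.
As in traditional adversarial learning, adv-ICL is implemented as a two-player game between the generator and the discriminator, in which the generator tries to produce output realistic enough to fool the discriminator.
In each round, the generator receives an input prefixed with a task instruction and several exemplars, and produces an output.
The discriminator is then tasked with classifying the generator's input-output pair as either model-generated or real data.
Based on the discriminator loss, the prompt modifier proposes possible edits to the generator and discriminator prompts, and the edits that most improve the adversarial loss are selected.
We show that adv-ICL yields significant improvements over state-of-the-art prompt optimization techniques, for both open-source and closed-source models, on 11 generation and classification tasks, including summarization, arithmetic reasoning, machine translation, data-to-text generation, and the MMLU and BIG-Bench Hard benchmarks.
In addition, because our method uses pre-trained models and updates only prompts rather than model parameters, it is computationally efficient, easy to extend to any LLM and task, and effective in low-resource settings.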
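The round structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the GAN-style form of the loss, the greedy choice among candidates, the decision to keep the current prompt as a candidate, and every name (`adversarial_loss`, `adv_icl`, the `toy_*` stand-ins) are hypothetical. In a real setup the three callables would wrap LLM calls; here they are deterministic toy functions so the loop can run without any model.

```python
import math
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (task input, reference output)
Generate = Callable[[str, str], str]             # (generator prompt, input) -> output
Discriminate = Callable[[str, str, str], float]  # (disc. prompt, input, output) -> P(real)
Propose = Callable[[str], List[str]]             # prompt -> candidate edited prompts


def adversarial_loss(gen_prompt: str, disc_prompt: str, real_pairs: List[Pair],
                     generate: Generate, discriminate: Discriminate) -> float:
    """Assumed GAN-style objective: the discriminator tries to raise it,
    the generator tries to lower it."""
    eps = 1e-9  # guards against log(0)
    total = 0.0
    for x, y_real in real_pairs:
        y_fake = generate(gen_prompt, x)
        p_real = discriminate(disc_prompt, x, y_real)
        p_fake = discriminate(disc_prompt, x, y_fake)
        total += math.log(max(p_real, eps)) + math.log(max(1.0 - p_fake, eps))
    return total / len(real_pairs)


def adv_icl(gen_prompt: str, disc_prompt: str, real_pairs: List[Pair],
            generate: Generate, discriminate: Discriminate, propose: Propose,
            rounds: int = 3) -> Tuple[str, str]:
    """Alternate discriminator and generator prompt updates for a few rounds."""
    for _ in range(rounds):
        # Discriminator step: keep the candidate prompt that maximizes the loss.
        # The current prompt stays in the pool (an assumption of this sketch).
        disc_candidates = [disc_prompt] + propose(disc_prompt)
        disc_prompt = max(
            disc_candidates,
            key=lambda d: adversarial_loss(gen_prompt, d, real_pairs,
                                           generate, discriminate))
        # Generator step: keep the candidate prompt that minimizes the loss.
        gen_candidates = [gen_prompt] + propose(gen_prompt)
        gen_prompt = min(
            gen_candidates,
            key=lambda g: adversarial_loss(g, disc_prompt, real_pairs,
                                           generate, discriminate))
    return gen_prompt, disc_prompt


# Toy stand-ins for the three LLMs (hypothetical; the task is "uppercase the input").
def toy_generate(prompt: str, x: str) -> str:
    return x.upper() if "uppercase" in prompt else x


def toy_discriminate(prompt: str, x: str, y: str) -> float:
    if "check case" in prompt:
        return 0.9 if y == x.upper() else 0.1
    return 0.5  # an uninformed discriminator cannot tell real from generated


def toy_propose(prompt: str) -> List[str]:
    return [prompt + " uppercase", prompt + " check case"]


if __name__ == "__main__":
    pairs = [("a", "A"), ("b", "B")]
    print(adv_icl("G", "D", pairs, toy_generate, toy_discriminate, toy_propose, rounds=2))
```

In this toy run the discriminator prompt first gains the "check case" edit, which lets it separate real from generated pairs, and the generator prompt then gains the "uppercase" edit, which makes its outputs match the references. No model parameters are touched at any point; only the two prompt strings change.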

Abstract (original)

We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompt for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier. As in traditional adversarial learning, adv-ICL is implemented as a two-player game between the generator and discriminator, where the generator tries to generate realistic enough output to fool the discriminator. In each round, given an input prefixed by task instructions and several exemplars, the generator produces an output. The discriminator is then tasked with classifying the generator input-output pair as model-generated or real data. Based on the discriminator loss, the prompt modifier proposes possible edits to the generator and discriminator prompts, and the edits that most improve the adversarial loss are selected. We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques for both open and closed-source models on 11 generation and classification tasks including summarization, arithmetic reasoning, machine translation, data-to-text generation, and the MMLU and big-bench hard benchmarks. In addition, because our method uses pre-trained models and updates only prompts rather than model parameters, it is computationally efficient, easy to extend to any LLM and task, and effective in low-resource settings.

arXiv information

Authors	Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He
Published	2023-12-05 09:44:45+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
