CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models

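Summary

Methods for adapting language models (LMs) to new tasks and domains have traditionally assumed white-box access to the model and work by modifying its parameters.
This is incompatible with a recent trend in the field, in which the highest-quality models are available only as black boxes through inference APIs.
Even when the model weights are available, the computational cost of fine-tuning a large LM can be prohibitive for most practitioners.
This work presents a lightweight method for adapting a large LM to new domains and tasks, assuming no access to its weights or intermediate activations.
The approach fine-tunes a small white-box LM and combines it with the large black-box LM at the probability level through a small network learned on a small validation set.
The approach is validated by adapting a large LM (OPT-30B) to several domains and to a downstream task (machine translation); performance improves in every case, by up to 9%, while using a domain expert that is 23 times smaller.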

Abstract (original)

Methods for adapting language models (LMs) to new tasks and domains have traditionally assumed white-box access to the model, and work by modifying its parameters. However, this is incompatible with a recent trend in the field, where the highest quality models are only available as black-boxes through inference APIs. Even when the model weights are available, the computational cost of fine-tuning large LMs can be prohibitive for most practitioners. In this work, we present a lightweight method for adapting large LMs to new domains and tasks, assuming no access to their weights or intermediate activations. Our approach fine-tunes a small white-box LM and combines it with the large black-box LM at the probability level through a small network, learned on a small validation set. We validate our approach by adapting a large LM (OPT-30B) to several domains and a downstream task (machine translation), observing improved performance in all cases, of up to 9%, while using a domain expert 23x smaller.
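
To make the idea of combining two models "at the probability level" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible combination, a single learned scalar weight `lam`, whereas the abstract only says that a "small network" is used. The function names (`combine`, `nll`, `fit_lambda`) and the toy validation data are hypothetical and exist only for illustration.

```python
import math

def combine(p_small, p_large, lam):
    """Mix two next-token distributions with weight lam on the small model."""
    return [lam * ps + (1.0 - lam) * pl for ps, pl in zip(p_small, p_large)]

def nll(examples, lam):
    """Average negative log-likelihood of the gold tokens under the mixture."""
    total = 0.0
    for p_small, p_large, gold in examples:
        total -= math.log(combine(p_small, p_large, lam)[gold])
    return total / len(examples)

def fit_lambda(examples, steps=200, lr=0.5):
    """Fit lam = sigmoid(w) by gradient descent on the validation NLL."""
    w = 0.0  # sigmoid(0) = 0.5, so both models start with equal weight
    for _ in range(steps):
        lam = 1.0 / (1.0 + math.exp(-w))
        grad = 0.0
        for p_small, p_large, gold in examples:
            mix = lam * p_small[gold] + (1.0 - lam) * p_large[gold]
            # d(-log mix)/d lam, then the chain rule through the sigmoid
            grad += -(p_small[gold] - p_large[gold]) / mix * lam * (1.0 - lam)
        w -= lr * grad / len(examples)
    return 1.0 / (1.0 + math.exp(-w))

# Toy validation set (assumed data): each entry holds the small model's
# probabilities, the large model's probabilities, and the gold token index.
validation = [
    ([0.7, 0.2, 0.1], [0.4, 0.4, 0.2], 0),
    ([0.6, 0.3, 0.1], [0.3, 0.5, 0.2], 0),
    ([0.2, 0.5, 0.3], [0.1, 0.8, 0.1], 1),
]

lam = fit_lambda(validation)
print(f"learned weight on the small model: {lam:.3f}")
print(f"NLL with equal weights: {nll(validation, 0.5):.3f}")
print(f"NLL with learned weight: {nll(validation, lam):.3f}")
```

Only output probabilities from the large model are needed here, which is what makes this kind of combination compatible with black-box access. A richer combination function could make the weight depend on the input, but that goes beyond what the abstract specifies.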

arXiv information

Authors	Aitor Ormazabal, Mikel Artetxe, Eneko Agirre
Published	2023-05-22 16:32:56+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
