Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling

Summary

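Large language models (LLMs) and vision-language models (VLMs) perform well on a wide range of tasks when their parameter counts are scaled from O(10^9) to O(10^{12}) and beyond.
At this scale, adapting and deploying a fully specialized model for each task of interest becomes infeasible.
Parameter-efficient fine-tuning (PEFT) has emerged as a promising direction for addressing the adaptation and serving challenges of such large models.
We divide PEFT techniques into two types: intrusive and non-intrusive.
Intrusive PEFT techniques directly modify the model's internal architecture.
They are more flexible, but they make training and serving considerably more complex.
Non-intrusive PEFT techniques leave the internal architecture unchanged and adapt only model-external parameters, such as the input embeddings.
In this work, we describe AdaLink, a non-intrusive PEFT technique that achieves performance competitive with SoTA intrusive PEFT (LoRA) and with full model fine-tuning (FT) on a variety of tasks.
We evaluate on both text-only and multimodal tasks, with experiments that account for both parameter-count scaling and the training regime (with and without instruction tuning).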

Abstract (original)

Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on a wide range of tasks by scaling up parameter counts from O(10^9) to O(10^{12}) levels and further beyond. These large scales make it impossible to adapt and deploy fully specialized models given a task of interest. Parameter-efficient fine-tuning (PEFT) emerges as a promising direction to tackle the adaptation and serving challenges for such large models. We categorize PEFT techniques into two types: intrusive and non-intrusive. Intrusive PEFT techniques directly change a model’s internal architecture. Though more flexible, they introduce significant complexities for training and serving. Non-intrusive PEFT techniques leave the internal architecture unchanged and only adapt model-external parameters, such as embeddings for input. In this work, we describe AdaLink as a non-intrusive PEFT technique that achieves competitive performance compared to SoTA intrusive PEFT (LoRA) and full model fine-tuning (FT) on various tasks. We evaluate using both text-only and multimodal tasks, with experiments that account for both parameter-count scaling and training regime (with and without instruction tuning).
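
The non-intrusive idea in the abstract (adapt only model-external parameters such as input embeddings, and leave the backbone untouched) can be illustrated with a small sketch. The abstract does not specify AdaLink's architecture, so everything here is an assumption made for illustration: the residual low-rank form of the adapter, the zero-initialized up-projection, and the names `InputAdapter`, `frozen_model`, and `adapted_forward` are all hypothetical.

```python
import random


def matvec(vec, mat):
    """Multiply a row vector (length n) by an n x m matrix stored as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


class InputAdapter:
    """Hypothetical residual low-rank adapter applied to input embeddings only."""

    def __init__(self, dim, rank, seed=0):
        rng = random.Random(seed)
        # Down-projection (dim x rank): small random values.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(rank)] for _ in range(dim)]
        # Up-projection (rank x dim): zeros, so the adapter starts as the identity.
        self.up = [[0.0] * dim for _ in range(rank)]

    def __call__(self, embedding):
        # e' = e + (e @ down) @ up
        delta = matvec(matvec(embedding, self.down), self.up)
        return [e + d for e, d in zip(embedding, delta)]

    def num_parameters(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])


def frozen_model(embeddings):
    """Stand-in for a frozen backbone: called as a black box, never modified."""
    return sum(sum(e) for e in embeddings)


def adapted_forward(adapter, embeddings):
    # Only the inputs are transformed; the backbone's internals are untouched.
    return frozen_model([adapter(e) for e in embeddings])
```

In this sketch the only trainable values are the adapter's `2 * dim * rank` entries, and the backbone is shared unchanged across tasks, which is the serving property the abstract attributes to non-intrusive methods. An intrusive method such as LoRA would instead add parameters inside the backbone's layers.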

arXiv information

Authors	Yaqing Wang, Jialin Wu, Tanmaya Dabral, Jiageng Zhang, Geoff Brown, Chun-Ta Lu, Frederick Liu, Yi Liang, Bo Pang, Michael Bendersky, Radu Soricut
Published	2023-10-18 16:43:08+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
