Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization

Summary

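Federated learning (FL) is a promising paradigm that enables collaborative model training on decentralized data.
However, training large language models (LLMs) generally involves updating a very large number of parameters, which limits the applicability of FL techniques to LLMs in real-world scenarios.
Prompt tuning can significantly reduce the number of parameters to update, but it incurs either performance degradation or low training efficiency.
Applying prompt tuning directly in FL often incurs non-trivial communication costs and substantially degrades performance.
In addition, decentralized data is generally not independent and identically distributed (non-IID), which causes client drift and, in turn, poor performance.
This paper proposes a parameter-efficient prompt tuning approach with adaptive optimization, FedPepTAO, to enable efficient and effective FL of LLMs.
First, an efficient partial prompt tuning approach is proposed to improve performance and efficiency simultaneously.
Second, a novel adaptive optimization method is developed to address client drift on both the device and server sides, further improving performance.
Extensive experiments on 10 datasets demonstrate the superior performance (up to 60.8% in terms of accuracy) and efficiency (up to 97.59% in terms of training time) of FedPepTAO compared with 9 baseline approaches.
The code is available at https://github.com/llm-eff/FedPepTAO.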
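
To make the communication argument concrete, the sketch below shows the generic idea of uploading and averaging only a subset of per-layer prompt parameters in a federated round. It is an illustration under stated assumptions, not the FedPepTAO algorithm: the layer choice, the `PromptState` type, and all function names are hypothetical, and the aggregation is plain FedAvg-style weighted averaging rather than the adaptive optimization the abstract describes.

```python
from typing import Dict, List

# Hypothetical representation: one prompt vector per layer index.
PromptState = Dict[int, List[float]]


def select_layers(state: PromptState, layers: List[int]) -> PromptState:
    """Keep only the prompt vectors of the chosen layers (what a client uploads)."""
    return {layer: state[layer] for layer in layers}


def count_params(state: PromptState) -> int:
    """Number of scalar parameters contained in a prompt state."""
    return sum(len(vector) for vector in state.values())


def weighted_average(updates: List[PromptState], sizes: List[int]) -> PromptState:
    """FedAvg-style aggregation: per-layer average weighted by local sample count."""
    total = sum(sizes)
    aggregated: PromptState = {}
    for layer, reference in updates[0].items():
        aggregated[layer] = [
            sum(update[layer][i] * n for update, n in zip(updates, sizes)) / total
            for i in range(len(reference))
        ]
    return aggregated


# Two toy clients, four layers each, two parameters per layer.
client_a = {0: [0.0, 0.0], 1: [1.0, 2.0], 2: [3.0, 4.0], 3: [0.0, 0.0]}
client_b = {0: [1.0, 1.0], 1: [5.0, 6.0], 2: [7.0, 8.0], 3: [1.0, 1.0]}

chosen_layers = [1, 2]  # arbitrary choice for illustration only
uploaded_a = select_layers(client_a, chosen_layers)
uploaded_b = select_layers(client_b, chosen_layers)

# Client B holds three times as many samples as client A.
aggregated = weighted_average([uploaded_a, uploaded_b], sizes=[1, 3])
```

In this toy setup each client uploads 4 of its 8 prompt parameters, so the per-round payload is halved; how the layers are actually selected, and how drift is corrected, is specific to the paper and not reproduced here.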

Summary (original)

Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. However, the training process of Large Language Models (LLMs) generally incurs the update of significant parameters, which limits the applicability of FL techniques to tackle the LLMs in real scenarios. Prompt tuning can significantly reduce the number of parameters to update, but it either incurs performance degradation or low training efficiency. The straightforward utilization of prompt tuning in the FL often raises non-trivial communication costs and dramatically degrades performance. In addition, the decentralized data is generally non-Independent and Identically Distributed (non-IID), which brings client drift problems and thus poor performance. This paper proposes a Parameter-efficient prompt Tuning approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and effective FL of LLMs. First, an efficient partial prompt tuning approach is proposed to improve performance and efficiency simultaneously. Second, a novel adaptive optimization method is developed to address the client drift problems on both the device and server sides to enhance performance further. Extensive experiments based on 10 datasets demonstrate the superb performance (up to 60.8% in terms of accuracy) and efficiency (up to 97.59% in terms of training time) of FedPepTAO compared with 9 baseline approaches. Our code is available at https://github.com/llm-eff/FedPepTAO.

arXiv information

Authors	Tianshi Che, Ji Liu, Yang Zhou, Jiaxiang Ren, Jiwen Zhou, Victor S. Sheng, Huaiyu Dai, Dejing Dou
Published	2023-10-23 16:37:59+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
