Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models

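Abstract

Fine-tuning large language models (LLMs) on devices is attracting growing interest.
Recent work has combined low-rank adaptation (LoRA) with federated fine-tuning to ease the challenges posed by on-device model size and data scarcity.
Even so, heterogeneity in computational resources remains a critical bottleneck: higher-rank modules generally improve performance, but varying device capabilities constrain the range of ranks that is feasible for LoRA.
Existing approaches to this problem either lack analytical justification or impose additional computational overhead, leaving a wide gap for an efficient and theoretically grounded solution.
To address these challenges, we propose federated sketching LoRA (FSLoRA), which uses a sketching mechanism that lets devices selectively update submatrices of the global LoRA modules maintained by the server.
By adjusting the sketching ratios, which determine the ranks of the submatrices on the devices, FSLoRA adapts flexibly to device-specific communication and computational constraints.
We provide a rigorous convergence analysis of FSLoRA that characterizes how the sketching ratios affect the convergence rate.
Through comprehensive experiments on multiple datasets and LLMs, we demonstrate FSLoRA's superior performance compared with various baselines.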
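
The abstract says that each device updates only a submatrix of the global LoRA module, with the sketching ratio setting that submatrix's rank. The following is a minimal illustrative sketch of that idea and not the authors' implementation. It assumes NumPy, a global module stored as factors `B` (d_out x rank) and `A` (rank x d_in), and uniform random selection of rank indices. The helper names `sample_rank_indices`, `extract_submatrices`, and `merge_update` are hypothetical, and any scaling or aggregation rule the paper may use is omitted.

```python
import numpy as np

def sample_rank_indices(rank, sketch_ratio, rng):
    """Pick which rank components a device will update (hypothetical helper)."""
    k = max(1, int(round(sketch_ratio * rank)))
    return np.sort(rng.choice(rank, size=k, replace=False))

def extract_submatrices(B, A, idx):
    """Return copies of the columns of B and rows of A selected by idx."""
    return B[:, idx].copy(), A[idx, :].copy()

def merge_update(B, A, idx, B_sub, A_sub):
    """Write a device's updated submatrices into copies of the global factors."""
    B_new, A_new = B.copy(), A.copy()
    B_new[:, idx] = B_sub
    A_new[idx, :] = A_sub
    return B_new, A_new

rng = np.random.default_rng(0)
d_out, d_in, rank = 16, 12, 8          # assumed toy dimensions
B = np.zeros((d_out, rank))            # global LoRA factor B
A = rng.normal(size=(rank, d_in))      # global LoRA factor A

# A device with a sketching ratio of 0.25 handles 2 of the 8 rank components.
idx = sample_rank_indices(rank, 0.25, rng)
B_sub, A_sub = extract_submatrices(B, A, idx)

# Stand-in for local training: perturb only the device's submatrix of B.
B_sub += 1.0

B_new, A_new = merge_update(B, A, idx, B_sub, A_sub)
```

In this toy setup a lower sketching ratio means the device stores, trains, and transmits fewer columns of `B` and rows of `A`, which is how the ratio ties into device-specific communication and computation budgets.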

Abstract (original)

Fine-tuning large language models (LLMs) on devices is attracting increasing interest. Recent works have fused low-rank adaptation (LoRA) techniques with federated fine-tuning to mitigate challenges associated with device model sizes and data scarcity. Still, the heterogeneity of computational resources remains a critical bottleneck: while higher-rank modules generally enhance performance, varying device capabilities constrain LoRA’s feasible rank range. Existing approaches attempting to resolve this issue either lack analytical justification or impose additional computational overhead, leaving a wide gap for an efficient and theoretically-grounded solution. To address these challenges, we propose federated sketching LoRA (FSLoRA), which leverages a sketching mechanism to enable devices to selectively update submatrices of global LoRA modules maintained by the server. By adjusting the sketching ratios, which determine the ranks of the submatrices on the devices, FSLoRA flexibly adapts to device-specific communication and computational constraints. We provide a rigorous convergence analysis of FSLoRA that characterizes how the sketching ratios affect the convergence rate. Through comprehensive experiments on multiple datasets and LLM models, we demonstrate FSLoRA’s superior performance compared to various baselines.

arXiv information

Authors	Wenzhi Fang, Dong-Jun Han, Liangqi Yuan, Seyyedali Hosseinalipour, Christopher G. Brinton
Published	2025-01-31 18:44:35+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
