On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

Summary

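On-device LLMs have attracted growing attention for their ability to strengthen privacy and deliver a personalized user experience.
Federated Learning has become a standard approach to enabling private learning when data is scarce.
It nonetheless faces challenges, notably heterogeneity in computational resources and in data across end users.
We propose CoMiGS ($\textbf{Co}$llaborative learning with a $\textbf{Mi}$xture of $\textbf{G}$eneralists and $\textbf{S}$pecialists), the first approach to address both challenges.
The key innovation is a bi-level optimization formulation of the Mixture-of-Experts learning objective, in which the router is optimized on a separate validation set to ensure alignment with the target distribution.
We solve this objective with alternating minimization and provide a theoretical analysis of it.
The method shares generalist experts across users while keeping a varying number of specialist experts local to each user, which adapts to each user's computational resources and preserves privacy.
Extensive experiments show that CoMiGS effectively balances general and personalized knowledge for each generated token.
We also demonstrate that CoMiGS remains robust against overfitting, owing to the regularizing effect of the generalists, while adapting to local data through specialist expertise.
We open-source our codebase for collaborative LLMs.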
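
The bi-level structure mentioned in the summary can be written down roughly as follows. This is only a sketch under assumed notation, not the paper's own formulation: $\theta^{G}$ stands for the shared generalist parameters, $\theta_i^{S}$ for user $i$'s local specialist parameters, $\phi_i$ for user $i$'s router parameters, and $\mathcal{L}_i^{\mathrm{train}}$, $\mathcal{L}_i^{\mathrm{val}}$ for that user's training and validation losses. All of these symbols, and the plain sum over $N$ users, are assumptions made for illustration.

$$
\min_{\phi_1,\dots,\phi_N} \; \sum_{i=1}^{N} \mathcal{L}_i^{\mathrm{val}}\big(\theta^{G\star}, \theta_i^{S\star}, \phi_i\big)
\quad \text{s.t.} \quad
\big(\theta^{G\star}, \theta_1^{S\star}, \dots, \theta_N^{S\star}\big) \in \arg\min_{\theta^{G},\, \theta_1^{S}, \dots, \theta_N^{S}} \; \sum_{i=1}^{N} \mathcal{L}_i^{\mathrm{train}}\big(\theta^{G}, \theta_i^{S}, \phi_i\big)
$$

Under this reading, alternating minimization would switch between two steps: updating the expert parameters on training data with the routers held fixed, and updating the routers on validation data with the experts held fixed.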

Abstract (original)

On-device LLMs have gained increasing attention for their ability to enhance privacy and provide a personalized user experience. To facilitate private learning with scarce data, Federated Learning has become a standard approach. However, it faces challenges such as computational resource heterogeneity and data heterogeneity among end users. We propose CoMiGS ($\textbf{Co}$llaborative learning with a $\textbf{Mi}$xture of $\textbf{G}$eneralists and $\textbf{S}$pecialists), the first approach to address both challenges. A key innovation of our method is the bi-level optimization formulation of the Mixture-of-Experts learning objective, where the router is optimized using a separate validation set to ensure alignment with the target distribution. We solve our objective with alternating minimization, for which we provide a theoretical analysis. Our method shares generalist experts across users while localizing a varying number of specialist experts, thereby adapting to users' computational resources and preserving privacy. Through extensive experiments, we show CoMiGS effectively balances general and personalized knowledge for each token generation. We demonstrate that CoMiGS remains robust against overfitting, due to the generalists' regularizing effect, while adapting to local data through specialist expertise. We open-source our codebase for collaborative LLMs.
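
To make the alternating scheme in the abstract more concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation and does not involve a language model: each "expert" is a single scalar weight in a one-dimensional regression, the router is one scalar gate per user, and the shared generalist is updated with the average of the users' gradients as a stand-in for whatever aggregation the real method uses. All names (`train`, `make_user`, `w_gen`, `w_spec`, `r`) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w_gen, w_spec, r):
    g = sigmoid(r)  # routing weight placed on the generalist expert
    return (g * w_gen + (1.0 - g) * w_spec) * x

def mse(data, w_gen, w_spec, r):
    return sum((predict(x, w_gen, w_spec, r) - y) ** 2 for x, y in data) / len(data)

def train(users, rounds=300, lr=0.05):
    """users: list of dicts holding 'train' and 'val' lists of (x, y) pairs."""
    w_gen = 0.0                      # shared generalist weight
    w_spec = [0.0 for _ in users]    # one local specialist weight per user
    r = [0.0 for _ in users]         # one local router logit per user
    for _ in range(rounds):
        # Step 1: update experts on training data, routers held fixed.
        gen_grads = []
        for i, u in enumerate(users):
            g = sigmoid(r[i])
            grad_gen = grad_spec = 0.0
            for x, y in u["train"]:
                err = predict(x, w_gen, w_spec[i], r[i]) - y
                grad_gen += 2.0 * err * g * x / len(u["train"])
                grad_spec += 2.0 * err * (1.0 - g) * x / len(u["train"])
            w_spec[i] -= lr * grad_spec   # specialist never leaves the user
            gen_grads.append(grad_gen)
        # Assumed aggregation: average the users' generalist gradients.
        w_gen -= lr * sum(gen_grads) / len(gen_grads)
        # Step 2: update routers on validation data, experts held fixed.
        for i, u in enumerate(users):
            g = sigmoid(r[i])
            grad_r = 0.0
            for x, y in u["val"]:
                err = predict(x, w_gen, w_spec[i], r[i]) - y
                grad_r += (2.0 * err * (w_gen - w_spec[i]) * x
                           * g * (1.0 - g) / len(u["val"]))
            r[i] -= lr * grad_r
    return w_gen, w_spec, r

def make_user(slope):
    # Noise-free toy data y = slope * x, split into train and validation.
    return {
        "train": [(x, slope * x) for x in (0.5, 1.0, 1.5, 2.0)],
        "val": [(x, slope * x) for x in (0.75, 1.25)],
    }

# Two users whose data follow different slopes (heterogeneous data).
users = [make_user(1.0), make_user(3.0)]
w_gen, w_spec, r = train(users)
for i, u in enumerate(users):
    print(i, mse(u["val"], w_gen, w_spec[i], r[i]), sigmoid(r[i]))
```

The point of the sketch is the split of responsibilities: only `w_gen` is shared, while `w_spec[i]` and `r[i]` stay with user `i`, and the router is fitted on held-out data rather than on the data used to fit the experts.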

arXiv information

Authors	Dongyang Fan, Bettina Messmer, Nikita Doikov, Martin Jaggi
Published	2025-02-18 16:27:26+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
