GuiLoMo: Allocating Expert Number and Rank for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors

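Summary

Parameter-efficient fine-tuning (PEFT) methods, particularly Low-Rank Adaptation (LoRA), offer an efficient way to adapt large language models at reduced computational cost.
However, their performance is limited by the small number of trainable parameters.
Recent work combines LoRA with Mixture-of-Experts (MoE), i.e., LoRA-MoE, to enhance capacity, but two limitations hinder the full exploitation of its potential: 1) the influence of downstream tasks when assigning expert numbers, and 2) the uniform rank assignment across all LoRA experts, which restricts representational diversity.
To mitigate these gaps, the authors propose GuiLoMo, a fine-grained, layer-wise strategy for allocating expert numbers and ranks using GuidedSelection Vectors (GSVs).
GSVs are learned through a prior bilevel optimization process so that they capture both model-specific and task-specific needs, and are then used to allocate the optimal expert numbers and ranks.
Experiments on three backbone models across diverse benchmarks show that GuiLoMo consistently achieves performance superior or comparable to all baselines.
Further analysis offers key insights into how expert numbers and ranks vary across layers and tasks, highlighting the benefits of adaptive expert configuration.
The code is available at https://github.com/Liar406/Gui-LoMo.git.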

Summary (original)

Parameter-efficient fine-tuning (PEFT) methods, particularly Low-Rank Adaptation (LoRA), offer an efficient way to adapt large language models with reduced computational costs. However, their performance is limited by the small number of trainable parameters. Recent work combines LoRA with the Mixture-of-Experts (MoE), i.e., LoRA-MoE, to enhance capacity, but two limitations remain in hindering the full exploitation of its potential: 1) the influence of downstream tasks when assigning expert numbers, and 2) the uniform rank assignment across all LoRA experts, which restricts representational diversity. To mitigate these gaps, we propose GuiLoMo, a fine-grained layer-wise expert numbers and ranks allocation strategy with GuidedSelection Vectors (GSVs). GSVs are learned via a prior bilevel optimization process to capture both model- and task-specific needs, and are then used to allocate optimal expert numbers and ranks. Experiments on three backbone models across diverse benchmarks show that GuiLoMo consistently achieves superior or comparable performance to all baselines. Further analysis offers key insights into how expert numbers and ranks vary across layers and tasks, highlighting the benefits of adaptive expert configuration. Our code is available at https://github.com/Liar406/Gui-LoMo.git.
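
To make the idea of layer-wise allocation concrete, here is a minimal sketch of how per-layer selection scores could be turned into an expert count and a rank per expert. It is an illustration only: the function name `allocate_from_gsv`, the score layout, the toy values, and the thresholding rule are all assumptions, since the abstract does not specify how GSVs are converted into allocations, and the bilevel optimization that would learn the scores is not shown.

```python
from typing import Dict, List, Tuple


def allocate_from_gsv(
    expert_gsv: List[float],
    rank_gsv: List[List[float]],
    threshold: float = 0.5,
) -> Tuple[int, List[int]]:
    """Turn one layer's selection scores into (expert count, rank per kept expert).

    Assumption (hypothetical, not taken from the paper): scores lie in [0, 1]
    and an expert or rank slot counts as selected when its score exceeds
    `threshold`.
    """
    if len(expert_gsv) != len(rank_gsv):
        raise ValueError("need one rank-score vector per candidate expert")

    ranks: List[int] = []
    for expert_score, rank_scores in zip(expert_gsv, rank_gsv):
        if expert_score <= threshold:
            continue  # this candidate expert is dropped for the layer
        # Rank = number of rank slots above the threshold, never below 1.
        ranks.append(max(1, sum(1 for s in rank_scores if s > threshold)))
    return len(ranks), ranks


# Toy scores for two layers: (expert scores, rank scores per expert).
# In the setting the abstract describes, such scores would be learned.
layer_scores: Dict[str, Tuple[List[float], List[List[float]]]] = {
    "layer_0": (
        [0.9, 0.2, 0.7],
        [[0.8, 0.6, 0.1, 0.1], [0.3, 0.2, 0.1, 0.0], [0.9, 0.9, 0.7, 0.6]],
    ),
    "layer_1": (
        [0.6, 0.1, 0.3],
        [[0.7, 0.2, 0.2, 0.1], [0.4, 0.4, 0.1, 0.1], [0.9, 0.8, 0.2, 0.1]],
    ),
}

allocation = {
    name: allocate_from_gsv(experts, ranks)
    for name, (experts, ranks) in layer_scores.items()
}
for name, (num_experts, ranks) in allocation.items():
    print(f"{name}: {num_experts} expert(s), ranks {ranks}")
```

With these toy scores the two layers end up with different expert counts and ranks, which is the kind of non-uniform configuration the abstract contrasts with a single expert number and rank shared by every layer.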

arXiv information

Authors	Hengyuan Zhang,Xinrong Chen,Yingmin Qiu,Xiao Liang,Ziyue Li,Guanyu Wang,Weiping Li,Tong Mo,Wenyue Li,Hayden Kwok-Hay So,Ngai Wong
Published	2025-06-17 15:41:33+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
