XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

Summary

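We introduce XFT, a simple yet powerful training scheme that raises the performance ceiling of instruction-tuned code large language models (LLMs) by merging upcycled Mixture-of-Experts (MoE) models.
Vanilla sparse upcycling fails to improve instruction tuning; XFT adds a shared expert mechanism with a novel routing weight normalization strategy to sparse upcycling, which significantly boosts instruction tuning.
After fine-tuning the upcycled MoE model, XFT applies a learnable model merging mechanism that compiles the upcycled MoE model back into a dense model, achieving upcycled-MoE-level performance with only dense-model compute.
Applying XFT to a 1.3B model yields a new state-of-the-art tiny code LLM (<3B) with 67.1 and 64.6 pass@1 on HumanEval and HumanEval+, respectively. With the same data and model architecture, XFT improves on supervised fine-tuning (SFT) by 13% on HumanEval+, with consistent improvements of 2% to 13% on MBPP+, MultiPL-E, and DS-1000, demonstrating its generalizability. XFT is fully orthogonal to existing techniques such as Evol-Instruct and OSS-Instruct, opening a new dimension for improving code instruction tuning. The code is available at https://github.com/ise-uiuc/xft.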
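The upcycle-then-merge idea can be illustrated with a toy sketch. Sparse upcycling initializes every expert as a copy of the dense feed-forward weights; the merge step below assumes the experts are collapsed into a single dense matrix through a softmax-normalized weighted average. That merge rule, and the names `upcycle`, `merge_experts`, and `mixing_logits`, are illustrative assumptions rather than the authors' implementation. The shared expert and routing weight normalization are omitted because the abstract does not describe them in detail.

```python
import math
from typing import List

Matrix = List[List[float]]


def upcycle(dense_ffn: Matrix, num_experts: int) -> List[Matrix]:
    """Sparse upcycling: every expert starts as an independent copy of the dense FFN."""
    return [[row[:] for row in dense_ffn] for _ in range(num_experts)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def merge_experts(experts: List[Matrix], mixing_logits: List[float]) -> Matrix:
    """Collapse experts into one dense matrix as a convex combination.

    Assumption: the mixing coefficients are a softmax over logits that
    would be learned in a real system; here they are passed in by hand.
    """
    coeffs = softmax(mixing_logits)
    rows, cols = len(experts[0]), len(experts[0][0])
    return [
        [sum(c * e[i][j] for c, e in zip(coeffs, experts)) for j in range(cols)]
        for i in range(rows)
    ]


dense = [[1.0, 2.0], [3.0, 4.0]]
experts = upcycle(dense, num_experts=4)
experts[1][0][0] = 5.0  # stand-in for fine-tuning moving one expert's weight
merged = merge_experts(experts, mixing_logits=[0.0, 0.0, 0.0, 0.0])
print(merged)
```

The merged matrix has the same shape as the original dense weights, which is why inference after merging costs no more than running the dense model.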

Summary (original)

We introduce XFT, a simple yet powerful training scheme, by simply merging upcycled Mixture-of-Experts (MoE) to unleash the performance limit of instruction-tuned code Large Language Models (LLMs). While vanilla sparse upcycling fails to improve instruction tuning, XFT introduces a shared expert mechanism with a novel routing weight normalization strategy into sparse upcycling, which significantly boosts instruction tuning. After fine-tuning the upcycled MoE model, XFT introduces a learnable model merging mechanism to compile the upcycled MoE model back to a dense model, achieving upcycled MoE-level performance with only dense-model compute. By applying XFT to a 1.3B model, we create a new state-of-the-art tiny code LLM (<3B) with 67.1 and 64.6 pass@1 on HumanEval and HumanEval+ respectively. With the same data and model architecture, XFT improves supervised fine-tuning (SFT) by 13% on HumanEval+, along with consistent improvements from 2% to 13% on MBPP+, MultiPL-E, and DS-1000, demonstrating its generalizability. XFT is fully orthogonal to existing techniques such as Evol-Instruct and OSS-Instruct, opening a new dimension for improving code instruction tuning. Codes are available at https://github.com/ise-uiuc/xft .
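For readers unfamiliar with the pass@1 figures quoted above, the sketch below shows the commonly used unbiased pass@k estimator that was introduced alongside HumanEval. The abstract does not state how many samples per problem were drawn or how the scores were computed, so this is a general illustration of the metric, not a reproduction of the authors' evaluation setup.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: budget of attempts being scored
    """
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# With k = 1 the estimate reduces to the fraction of passing samples.
print(pass_at_k(n=10, c=3, k=1))
```

A benchmark score is the mean of this per-problem value over all problems, usually reported as a percentage.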

arXiv information

Authors	Yifeng Ding, Jiawei Liu, Yuxiang Wei, Terry Yue Zhuo, Lingming Zhang
Published	2024-04-23 17:32:24+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
