FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion

Summary

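Heterogeneous model fusion improves LLM performance by integrating the knowledge and capabilities of multiple structurally diverse models.
However, existing approaches often rely solely on selecting the best output for each prompt from the source models. Because this draws on only limited source knowledge and yields sparse optimization signals, it does not exploit the source models' full potential.
To address this limitation, we propose FuseRL, a novel two-stage framework comprising FuseSFT and FusePO, to maximize the utilization of source LLMs.
FuseSFT establishes a robust initialization by integrating the strengths of heterogeneous source models through weighted supervised fine-tuning (SFT) on diverse outputs for each prompt.
FusePO optimizes weighted preferences based on the outputs of multiple source models to enable superior alignment performance.
Extensive experiments demonstrate the effectiveness of the framework across various preference alignment methods, including RLOO, DPO, and SimPO.
Using Llama-3.1-8B-Instruct as the target model, our approach achieves state-of-the-art performance among 8B LLMs on the AlpacaEval-2 and Arena-Hard benchmarks.
Further analysis suggests that FuseSFT regularizes the training process to reduce overfitting, while FusePO introduces dense and diverse signals for preference optimization.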
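
The abstract does not give the loss functions, so the following is only a minimal sketch of what "weighted SFT" and "weighted preference optimization" over several source models could look like. It assumes that each source model's response comes with a scalar reward score, that weights are a softmax over those rewards, and that the preference stage uses one DPO-style pair per source model. The function names, the weighting scheme, and the pairing scheme are all illustrative assumptions, not the authors' implementation; only the per-pair DPO formula is the standard one.

```python
import math

def softmax_weights(rewards, temperature=1.0):
    """Turn per-source reward scores into weights summing to 1 (assumed scheme)."""
    peak = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp((r - peak) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sft_loss(nlls, rewards, temperature=1.0):
    """Reward-weighted average of per-response negative log-likelihoods.

    nlls[i] is the target model's NLL on the response from source model i.
    """
    weights = softmax_weights(rewards, temperature)
    return sum(w * nll for w, nll in zip(weights, nlls))

def dpo_pair_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one pair, from sequence log-probabilities."""
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

def weighted_preference_loss(pairs, rewards, beta=0.1, temperature=1.0):
    """Weighted sum of DPO losses, one (chosen, rejected) pair per source model.

    Each pair is (pi_chosen, pi_rejected, ref_chosen, ref_rejected).
    Every source contributes a signal, instead of only the single best output.
    """
    weights = softmax_weights(rewards, temperature)
    losses = [dpo_pair_loss(*pair, beta=beta) for pair in pairs]
    return sum(w * loss for w, loss in zip(weights, losses))

if __name__ == "__main__":
    # Made-up numbers for three hypothetical source models.
    rewards = [0.2, 0.9, 0.5]
    print(weighted_sft_loss([2.1, 1.4, 1.8], rewards))
    pairs = [(-12.0, -15.0, -13.0, -14.0),
             (-10.0, -16.0, -11.0, -15.0),
             (-11.0, -13.0, -11.5, -12.5)]
    print(weighted_preference_loss(pairs, rewards))
```

Under these assumptions, selecting only the best output per prompt corresponds to the limit of a very small `temperature`, where nearly all weight falls on the highest-reward source; larger temperatures spread the training signal across all sources.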

Summary (original)

Heterogeneous model fusion enhances the performance of LLMs by integrating the knowledge and capabilities of multiple structurally diverse models. However, existing approaches often rely solely on selecting the best output for each prompt from source models, which underutilizes their full potential due to limited source knowledge and results in sparse optimization signals. To address this limitation, we propose FuseRL, a novel two-stage framework comprising FuseSFT and FusePO to maximize the utilization of source LLMs. FuseSFT establishes a robust initialization by integrating the strengths of heterogeneous source models through weighted supervised fine-tuning (SFT) on diverse outputs for each prompt. FusePO optimizes weighted preferences based on the outputs of multiple source models to enable superior alignment performance. Extensive experiments demonstrate the effectiveness of our framework across various preference alignment methods, including RLOO, DPO, and SimPO. Using Llama-3.1-8B-Instruct as the target model, our approach achieves state-of-the-art performance among 8B LLMs on the AlpacaEval-2 and Arena-Hard benchmarks. Further analysis suggests that FuseSFT regularizes the training process to reduce overfitting, while FusePO introduces dense and diverse signals for preference optimization.

arXiv information

Authors	Longguang Zhong, Fanqi Wan, Ziyi Yang, Guosheng Liang, Tianyuan Shi, Xiaojun Quan
Published	2025-04-17 09:49:43+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
