FusionBench: A Comprehensive Benchmark of Deep Model Fusion

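Summary

Deep model fusion is an emerging technique that unifies the predictions or parameters of multiple deep neural networks into a single model in a cost-effective and data-efficient way.
The unified model can thereby draw on the strengths of the original models and potentially exceed their performance.
Although many deep model fusion techniques have been proposed, their evaluations tend to be inconsistent and are often insufficient to validate their effectiveness and their robustness to distribution shifts.
To address this issue, we introduce FusionBench, the first comprehensive benchmark dedicated to deep model fusion.
FusionBench covers a wide range of tasks, including open-vocabulary image classification, text classification, and text-to-text generation.
Each category includes up to eight tasks with corresponding task-specific models, covering both full fine-tuning and LoRA fine-tuning as well as models of different sizes, so that multi-task model fusion techniques can be compared fairly and evenly across tasks, model scales, and fine-tuning strategies.
We implement and evaluate a broad range of deep model fusion techniques.
These range from model ensemble methods, which combine predictions to improve overall performance, to model merging, which integrates different models into one, and model mixing methods, which upscale or recombine components of the original models.
FusionBench currently contains 26 distinct tasks, 74 fine-tuned models, and 16 fusion techniques, and we are committed to continuously expanding the benchmark with more tasks, models, and fusion techniques.
In addition, we provide a well-documented set of resources and guidelines to help researchers understand and reproduce the benchmark results.
Homepage: https://tanganke.github.io/fusion_bench/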
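
The difference between model ensembling (combining predictions) and model merging (combining parameters) can be illustrated with a minimal sketch. It assumes a toy one-layer ReLU model and uses plain parameter averaging as the merging rule, which is only one simple baseline. The names `predict`, `ensemble_predict`, and `merge_by_averaging` are hypothetical and are not part of the FusionBench API; model mixing is not shown.

```python
from typing import Dict, List

# Hypothetical parameter container: name -> flat list of floats.
Params = Dict[str, List[float]]


def predict(params: Params, x: List[float]) -> float:
    """Toy one-layer model: ReLU(w . x + b)."""
    z = sum(w * xi for w, xi in zip(params["w"], x)) + params["b"][0]
    return max(0.0, z)


def ensemble_predict(models: List[Params], x: List[float]) -> float:
    """Model ensembling: run every model and average the predictions."""
    return sum(predict(m, x) for m in models) / len(models)


def merge_by_averaging(models: List[Params]) -> Params:
    """Model merging (simple averaging): average parameters element-wise."""
    merged: Params = {}
    for name in models[0]:
        columns = zip(*(m[name] for m in models))
        merged[name] = [sum(col) / len(models) for col in columns]
    return merged


# Two made-up "fine-tuned" models with the same architecture.
model_a: Params = {"w": [1.0, -2.0], "b": [0.5]}
model_b: Params = {"w": [-1.0, 2.0], "b": [0.5]}
x = [2.0, 0.5]

merged = merge_by_averaging([model_a, model_b])
ensemble_out = ensemble_predict([model_a, model_b], x)  # keeps both models
merged_out = predict(merged, x)                         # single merged model

print(ensemble_out, merged_out)
```

Because of the nonlinearity, the two approaches give different outputs for the same input. The ensemble must keep and run every model at inference time, whereas the merged model has the cost of a single model.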

Abstract (original)

Deep model fusion is an emerging technique that unifies the predictions or parameters of several deep neural networks into a single model in a cost-effective and data-efficient manner. This enables the unified model to take advantage of the original models’ strengths, potentially exceeding their performance. Although a variety of deep model fusion techniques have been introduced, their evaluations tend to be inconsistent and often inadequate to validate their effectiveness and robustness against distribution shifts. To address this issue, we introduce FusionBench, which is the first comprehensive benchmark dedicated to deep model fusion. FusionBench covers a wide range of tasks, including open-vocabulary image classification, text classification, and text-to-text generation. Each category includes up to eight tasks with corresponding task-specific models, featuring both full fine-tuning and LoRA fine-tuning, as well as models of different sizes, to ensure fair and balanced comparisons of various multi-task model fusion techniques across different tasks, model scales, and fine-tuning strategies. We implement and evaluate a broad spectrum of deep model fusion techniques. These techniques range from model ensemble methods, which combine the predictions to improve the overall performance, to model merging, which integrates different models into a single one, and model mixing methods, which upscale or recombine the components of the original models. FusionBench now contains 26 distinct tasks, 74 fine-tuned models, and 16 fusion techniques, and we are committed to consistently expanding the benchmark with more tasks, models, and fusion techniques. In addition, we offer a well-documented set of resources and guidelines to aid researchers in understanding and replicating the benchmark results. Homepage https://tanganke.github.io/fusion_bench/

arXiv information

Authors	Anke Tang, Li Shen, Yong Luo, Han Hu, Bo Du, Dacheng Tao
Published	2024-06-05 13:54:28+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
