MoPE: Mixture of Prompt Experts for Parameter-Efficient and Scalable Multimodal Fusion

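Summary

Although prompt-based multimodal fusion methods have demonstrated parameter efficiency, their limited adaptivity and expressiveness often lead to suboptimal performance compared with other tuning approaches.
This paper introduces Mixture of Prompt Experts (MoPE), the first technique designed to overcome these limitations by decomposing standard prompts so that instance-level features are captured adaptively.
Building on this decomposition, MoPE strengthens the expressiveness of prompt fusion by using multimodal pairing priors to dynamically route the most effective prompt for each instance.
Compared with vanilla prompting, the MoPE-based fusion method is more expressive and scales more effectively with both the training data and the total number of trainable parameters.
The paper also investigates regularization terms for expert routing, which lead to emergent expert specialization with improved adaptivity and interpretability.
Extensive experiments on six multimodal datasets spanning four modalities show state-of-the-art performance for prompt fusion, matching or even surpassing fine-tuning while requiring only 0.8% of the trainable parameters.
Project homepage: https://github.com/songrise/MoPE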
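
The summary describes routing a prompt per instance from a set of prompt experts, but gives no formulas. The following is a minimal sketch of one way such routing could look, not the authors' implementation. It assumes that the routing signal is a feature vector from the paired modality, that each expert has a learnable key, and that the experts are blended with softmax weights. The names `route_prompt`, `expert_keys`, `expert_prompts`, and `temperature` are hypothetical and do not come from the paper or its repository.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def route_prompt(cond_feature, expert_keys, expert_prompts, temperature=1.0):
    """Blend k prompt experts into one instance-specific prompt.

    cond_feature:   (d,)      feature from the paired modality (assumed routing signal)
    expert_keys:    (k, d)    one key per expert (hypothetical parameterization)
    expert_prompts: (k, l, d) k experts, each a prompt of l tokens of width d
    """
    # Score each expert against the instance feature, then normalize.
    scores = expert_keys @ cond_feature / temperature            # (k,)
    weights = softmax(scores)                                    # (k,), sums to 1
    # Weighted sum over the expert axis gives the dynamic prompt.
    dynamic_prompt = np.tensordot(weights, expert_prompts, axes=1)  # (l, d)
    return dynamic_prompt, weights


# Toy usage with random values standing in for learned parameters.
rng = np.random.default_rng(0)
k, l, d = 4, 6, 16
keys = rng.normal(size=(k, d))
prompts = rng.normal(size=(k, l, d))
feat = rng.normal(size=d)
prompt, w = route_prompt(feat, keys, prompts)
```

In this sketch a lower `temperature` pushes the weights toward a single expert, which is one simple way to picture the expert specialization mentioned in the summary. The regularization terms the paper studies are not modeled here.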

Summary (original)

Despite the demonstrated parameter efficiency of prompt-based multimodal fusion methods, their limited adaptivity and expressiveness often result in suboptimal performance compared to other tuning approaches. In this paper, we introduce the Mixture of Prompt Experts (MoPE), the first technique designed to overcome these limitations by decomposing standard prompts to capture instance-level features adaptively. Building on this decomposition, MoPE enhances prompt fusion’s expressiveness by leveraging multimodal pairing priors to route the most effective prompt for each instance dynamically. Compared to vanilla prompting, our MoPE-based fusion method exhibits greater expressiveness, scaling more effectively with the training data and the overall number of trainable parameters. We also investigate regularization terms for expert routing, which lead to emergent expert specialization with enhanced adaptiveness and interpretability. Extensive experiments across six multimodal datasets spanning four modalities demonstrate state-of-the-art performance for prompt fusion, matching or even surpassing the performance of fine-tuning while requiring only 0.8% of the trainable parameters. Project homepage: https://github.com/songrise/MoPE

arXiv information

Authors	Ruixiang Jiang, Lingbo Liu, Changwen Chen
Published	2025-01-14 08:01:17+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
