ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models

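Summary

Performance prediction estimates how well a multilingual language model (LM) will perform, reducing the computational cost tied to model capacity and fine-tuning data.
This paper introduces ProxyLM, a scalable framework that uses proxy models to predict LM performance on multilingual tasks.
The proxy models act as surrogates that approximate the performance of an LM fine-tuned on a specific downstream natural language processing (NLP) task.
By relying on proxy models, ProxyLM substantially reduces the computational overhead of task evaluation, achieving up to a 37.08x speedup over traditional methods even with the smallest proxy models.
The method also adapts to languages that the pre-trained LM has not seen before, outperforming the state of the art by 1.89x as measured by root-mean-square error (RMSE).
The framework streamlines model selection, enabling efficient deployment and iterative LM improvement without extensive computational resources.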

Summary (original)

Performance prediction is a method to estimate the performance of multilingual language models (LMs), mitigating computational costs associated with model capacity and data for fine-tuning. Our paper introduces ProxyLM, a scalable framework for predicting LM performance using proxy models in multilingual tasks. These proxy models act as surrogates, approximating the performance of fine-tuned LMs on specific downstream natural language processing (NLP) tasks. By leveraging proxy models, ProxyLM significantly reduces computational overhead on task evaluations, achieving up to a 37.08x speedup compared to traditional methods, even with our smallest proxy models. Additionally, our methodology showcases adaptability to previously unseen languages in pre-trained LMs, outperforming the state-of-the-art performance by 1.89x as measured by root-mean-square-error (RMSE). This framework streamlines model selection, enabling efficient deployment and iterative LM enhancements without extensive computational resources.
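The idea of predicting a fine-tuned LM's score from proxy-model scores, and of judging that prediction by RMSE, can be illustrated with a small sketch. This is not the ProxyLM implementation: the abstract does not specify the regressor or the features, so the ordinary least-squares model, the function names (`rmse`, `fit_linear_predictor`, `predict`), and the synthetic data below are all assumptions made for illustration only.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def _with_bias(features):
    """Append a constant column so the linear model has an intercept."""
    features = np.asarray(features, dtype=float)
    return np.column_stack([features, np.ones(len(features))])


def fit_linear_predictor(proxy_scores, target_scores):
    """Fit least-squares weights mapping proxy-model scores to target-LM scores.

    Hypothetical stand-in for whatever regressor the paper actually uses.
    """
    targets = np.asarray(target_scores, dtype=float)
    weights, *_ = np.linalg.lstsq(_with_bias(proxy_scores), targets, rcond=None)
    return weights


def predict(weights, proxy_scores):
    """Predict target-LM scores from proxy-model scores."""
    return _with_bias(proxy_scores) @ weights


# Synthetic data, invented for this sketch (not results from the paper):
# each row is one experimental setting, each column one small proxy model's score.
rng = np.random.default_rng(0)
proxy = rng.uniform(10, 40, size=(60, 2))
target = 1.2 * proxy[:, 0] + 0.5 * proxy[:, 1] + 3.0 + rng.normal(0, 1.0, size=60)

train, test = slice(0, 45), slice(45, 60)
weights = fit_linear_predictor(proxy[train], target[train])
predicted = predict(weights, proxy[test])

# Naive baseline: always predict the mean score seen during training.
baseline = np.full(len(target[test]), target[train].mean())

print("predictor RMSE:", rmse(target[test], predicted))
print("baseline RMSE: ", rmse(target[test], baseline))
```

A lower RMSE means the predicted scores sit closer to the scores that would have been obtained by actually fine-tuning and evaluating the large model, which is the expensive step that performance prediction tries to avoid.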

arXiv information

Authors	David Anugraha, Genta Indra Winata, Chenyue Li, Patrick Amadeus Irawan, En-Shiun Annie Lee
Published	2024-06-13 17:15:33+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
