ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models

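Abstract

Performance prediction estimates how a language model (LM) will perform on various natural language processing (NLP) tasks, reducing the computational cost that model capacity and fine-tuning data otherwise impose.
This paper introduces ProxyLM, a scalable framework that uses proxy models to predict LM performance on multilingual tasks.
The proxy models act as surrogates that approximate the performance of the LM of interest.
By relying on proxy models, ProxyLM substantially reduces the computational overhead of task evaluation, achieving up to a 37.08x speedup over traditional methods even with the smallest proxy models.
The method also adapts to languages previously unseen by the pre-trained LMs, outperforming the state of the art by 1.89x as measured by root-mean-square error (RMSE).
The framework streamlines model selection, enabling efficient deployment and iterative LM improvement without extensive computational resources.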
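
To make the idea concrete, here is a minimal sketch of predicting a target LM's score from a proxy model's score and measuring the error with RMSE. It is an illustration under stated assumptions, not the paper's implementation: the single-feature linear regressor, the helper names `rmse` and `fit_line`, and all the numbers are hypothetical, and the scores are synthetic rather than taken from the paper.

```python
import math
import random


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


# Synthetic, made-up scores for illustration only (not results from the paper).
# Each entry stands for one experimental setting, e.g. one language or dataset.
rng = random.Random(0)
proxy_scores = [rng.uniform(10.0, 40.0) for _ in range(30)]
target_scores = [1.2 * s + 5.0 + rng.gauss(0.0, 1.0) for s in proxy_scores]

# Fit on the first 20 settings, then evaluate on the 10 held-out settings.
slope, intercept = fit_line(proxy_scores[:20], target_scores[:20])
predictions = [slope * s + intercept for s in proxy_scores[20:]]
test_rmse = rmse(predictions, target_scores[20:])
print(f"held-out RMSE: {test_rmse:.3f}")
```

The cost saving in this picture comes from only having to evaluate the small proxy model on a new setting; the expensive target LM is needed only for the settings used to fit the regressor. A lower held-out RMSE means the predictor tracks the target LM's scores more closely.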

Abstract (original)

Performance prediction is a method to estimate the performance of Language Models (LMs) on various Natural Language Processing (NLP) tasks, mitigating computational costs associated with model capacity and data for fine-tuning. Our paper introduces ProxyLM, a scalable framework for predicting LM performance using proxy models in multilingual tasks. These proxy models act as surrogates, approximating the performance of the LM of interest. By leveraging proxy models, ProxyLM significantly reduces computational overhead on task evaluations, achieving up to a 37.08x speedup compared to traditional methods, even with our smallest proxy models. Additionally, our methodology showcases adaptability to previously unseen languages in pre-trained LMs, outperforming the state-of-the-art performance by 1.89x as measured by root-mean-square error (RMSE). This framework streamlines model selection, enabling efficient deployment and iterative LM enhancements without extensive computational resources.

arXiv information

Authors	David Anugraha, Genta Indra Winata, Chenyue Li, Patrick Amadeus Irawan, En-Shiun Annie Lee
Published	2024-06-14 14:52:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
