Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance

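Summary

Accurately predicting the performance of large language models (LLMs) before extensive fine-tuning or merging can substantially reduce both computational cost and development time.
Prior approaches such as scaling laws account for global factors like parameter count and training tokens, but they often overlook explicit lineage relationships.
In this work, we propose a novel Lineage-Regularized Matrix Factorization (LRMF) framework that encodes ancestral ties among LLMs via a graph Laplacian regularizer.
By leveraging multi-hop parent-child connections, LRMF consistently outperforms conventional matrix factorization and collaborative filtering methods in both instance-level and benchmark-level performance prediction.
Our large-scale study covers 2,934 publicly available Hugging Face models and more than 21,000 instances across 6 major benchmarks, and shows that lineage constraints yield up to 7-10 percentage points higher correlation with actual performance than the baselines.
Moreover, LRMF effectively addresses the cold-start problem, providing accurate estimates for newly derived or merged models even with minimal data.
This lineage-guided strategy therefore offers a resource-efficient way to inform hyperparameter tuning, data selection, and model combination in modern LLM development.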

Abstract (original)

Accurately forecasting the performance of Large Language Models (LLMs) before extensive fine-tuning or merging can substantially reduce both computational expense and development time. Although prior approaches like scaling laws account for global factors such as parameter size or training tokens, they often overlook explicit lineage relationships – i.e., which models are derived or merged from which parents. In this work, we propose a novel Lineage-Regularized Matrix Factorization (LRMF) framework that encodes ancestral ties among LLMs via a graph Laplacian regularizer. By leveraging multi-hop parent-child connections, LRMF consistently outperforms conventional matrix factorization and collaborative filtering methods in both instance-level and benchmark-level performance prediction. Our large-scale study includes 2,934 publicly available Hugging Face models and 21,000+ instances across 6 major benchmarks, showing that lineage constraints yield up to 7-10 percentage points higher correlation with actual performance compared to baselines. Moreover, LRMF effectively addresses the cold-start problem, providing accurate estimates for newly derived or merged models even with minimal data. This lineage-guided strategy thus offers a resource-efficient way to inform hyperparameter tuning, data selection, and model combination in modern LLM development.
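To make the idea of a graph Laplacian regularizer concrete, the following is a minimal sketch of generic graph-regularized matrix factorization, not the authors' implementation. It assumes an unnormalized Laplacian built from direct parent-child edges only (no multi-hop weighting), a squared-error loss on observed entries, and plain gradient descent. The function names, hyperparameters, and toy data are all hypothetical.

```python
import numpy as np


def lineage_laplacian(n_models, edges):
    """Build the unnormalized graph Laplacian L = D - A from (parent, child) pairs."""
    A = np.zeros((n_models, n_models))
    for parent, child in edges:
        A[parent, child] = 1.0
        A[child, parent] = 1.0  # treat lineage links as undirected
    return np.diag(A.sum(axis=1)) - A


def fit_lrmf_sketch(R, mask, L, rank=2, lam=0.01, gamma=0.1,
                    lr=0.05, steps=2000, seed=0):
    """Gradient descent on
        0.5 * ||mask * (U V^T - R)||^2
      + 0.5 * lam * (||U||^2 + ||V||^2)
      + 0.5 * gamma * trace(U^T L U),
    where the last term pulls factors of related models together.
    """
    rng = np.random.default_rng(seed)
    n_models, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_models, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for _ in range(steps):
        E = mask * (U @ V.T - R)                  # error on observed entries only
        grad_U = E @ V + lam * U + gamma * (L @ U)
        grad_V = E.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V


# Hypothetical toy data: 3 models x 4 benchmark instances, scores in [0, 1].
# Model 0 is the parent of models 1 and 2; model 2 has no observed scores
# (a cold-start case), so its factors are shaped only by the lineage term.
R = np.array([[0.9, 0.8, 0.2, 0.1],
              [0.8, 0.9, 0.3, 0.2],
              [0.0, 0.0, 0.0, 0.0]])
mask = np.array([[1.0, 1.0, 1.0, 1.0],
                 [1.0, 1.0, 1.0, 1.0],
                 [0.0, 0.0, 0.0, 0.0]])
L = lineage_laplacian(3, [(0, 1), (0, 2)])
U, V = fit_lrmf_sketch(R, mask, L)
predictions = U @ V.T  # row 2 holds the estimates for the cold-start model
```

In this sketch, setting `gamma=0` reduces the objective to ordinary regularized matrix factorization, in which case the unobserved model receives no signal at all; the Laplacian term is what lets information flow from a parent to its derived models.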

arXiv information

Authors	Takuya Tamura,Taro Yano,Masafumi Enomoto,Masafumi Oyamada
Published	2025-04-28 14:08:45+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
