LMD3: Language Model Data Density Dependence

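Summary

We develop a methodology for analyzing language model task performance at the level of individual examples, based on estimates of training data density.
In experiments that use paraphrasing as a controlled intervention on the finetuning data, we show that increasing the support in the training distribution for specific test queries yields a measurable increase in density, which is also a significant predictor of the performance gain caused by the intervention.
Experiments with pretraining data show that density measurements can explain a significant fraction of the variance in model perplexity.
We conclude that our framework can provide statistical evidence that a target model's predictions depend on subsets of its training data and, more generally, can be used to characterize the support (or lack thereof) in the training data for a given test task.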

Summary (original)

We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density, which is also a significant predictor of the performance increase caused by the intervention. Experiments with pretraining data demonstrate that we can explain a significant fraction of the variance in model perplexity via density measurements. We conclude that our framework can provide statistical evidence of the dependence of a target model’s predictions on subsets of its training data, and can more generally be used to characterize the support (or lack thereof) in the training data for a given test task.
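
As a rough illustration of the two quantities the abstract refers to, the sketch below estimates a log density for each test query from a set of training embeddings, and then measures how much of the variance in a per-example score (such as perplexity) a linear fit on that density explains. The abstract does not name an estimator, so the Gaussian kernel, the `bandwidth` value, the use of fixed-size embedding vectors, and both function names are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def gaussian_kde_log_density(queries, train, bandwidth=0.5):
    """Log of a Gaussian kernel density estimate at each query point.

    Assumption: `queries` (m, d) and `train` (n, d) are embedding vectors
    produced by some text encoder; the encoder itself is out of scope here.
    """
    queries = np.asarray(queries, dtype=float)
    train = np.asarray(train, dtype=float)
    n, d = train.shape

    # Pairwise squared Euclidean distances, shape (m, n).
    diff = queries[:, None, :] - train[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Log of the unnormalised Gaussian kernel for every (query, train) pair.
    log_kernel = -sq_dist / (2.0 * bandwidth ** 2)

    # Log-sum-exp over training points, for numerical stability.
    max_log = log_kernel.max(axis=1, keepdims=True)
    log_sum = max_log[:, 0] + np.log(np.exp(log_kernel - max_log).sum(axis=1))

    # Normalisation: average over n points and the Gaussian constant in d dims.
    log_norm = np.log(n) + 0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2)
    return log_sum - log_norm


def variance_explained(x, y):
    """R^2 of a least-squares line y ~ slope * x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residual = y - (slope * x + intercept)
    return 1.0 - residual.var() / y.var()
```

In this sketch, a query whose embedding lies close to many training embeddings receives a higher log density than one far from all of them, and `variance_explained(log_density, per_example_perplexity)` gives one simple way to quantify the kind of relationship the abstract describes.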

arXiv information

Authors	John Kirchenbauer, Garrett Honke, Gowthami Somepalli, Jonas Geiping, Daphne Ippolito, Katherine Lee, Tom Goldstein, David Andre
Published	2024-05-10 09:03:27+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
