Scaling-laws for Large Time-series Models

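Summary

Scaling laws for large language models (LLMs) provide useful guidance for training ever-larger models with predictable performance gains.
Time series forecasting shares a sequential structure similar to that of language and is well suited to large-scale transformer architectures.
Here we show that foundational decoder-only time series transformer models exhibit scaling behavior analogous to that of LLMs, and that architectural details (aspect ratio and number of heads) have minimal effect over broad ranges.
We assemble a large corpus of heterogeneous time series data on which to train, and establish for the first time power-law scaling with parameter count, dataset size, and training compute, spanning five orders of magnitude.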
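
As a rough illustration of what establishing power-law scaling involves, the sketch below fits a relation of the form L = a * N^(-alpha) by ordinary least squares in log-log space. The function name `fit_power_law`, the functional form, and every number in it are assumptions made for illustration; the data are synthetic and are not results from the paper, and the authors' actual fitting procedure may differ.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-alpha) in log-log space.

    Hypothetical helper for illustration. Returns (a, alpha).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    # Taking logs turns the power law into a straight line:
    # log y = log a - alpha * log x
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic illustration only: these values are NOT taken from the paper.
true_a, true_alpha = 5.0, 0.1
param_counts = [10 ** k for k in range(3, 9)]  # 1e3 to 1e8: five orders of magnitude
losses = [true_a * n ** (-true_alpha) for n in param_counts]

a_hat, alpha_hat = fit_power_law(param_counts, losses)
print(f"a = {a_hat:.3f}, alpha = {alpha_hat:.3f}")
```

On a log-log plot such a relation is a straight line with slope -alpha, which is why a linear fit on the logged values recovers the exponent. The same helper could be applied to dataset size or training compute in place of parameter count.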

Summary (original)

Scaling laws for large language models (LLMs) have provided useful guidance in training ever larger models for predictable performance gains. Time series forecasting shares a similar sequential structure to language, and is amenable to large-scale transformer architectures. Here we show that foundational decoder-only time series transformer models exhibit analogous scaling-behavior to LLMs, with architectural details (aspect ratio and number of heads) having a minimal effect over broad ranges. We assemble a large corpus of heterogenous time series data on which to train, and establish for the first time power-law scaling with parameter count, dataset size, and training compute, spanning five orders of magnitude.

arXiv information

Authors	Thomas D. P. Edwards, James Alvey, Justin Alsing, Nam H. Nguyen, Benjamin D. Wandelt
Published	2025-01-08 14:08:11+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
