VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

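Summary

Foundation models have emerged as a promising approach to time series forecasting (TSF).
Existing approaches either repurpose large language models (LLMs) or build large-scale time series datasets in order to develop TSF foundation models for universal forecasting.
However, these methods face challenges because of the severe cross-domain gap or in-domain heterogeneity.
This paper explores a new route: building a TSF foundation model from rich, high-quality natural images.
Our key insight is that a visual masked autoencoder pre-trained on the ImageNet dataset can naturally serve as a numeric series forecaster.
By reformulating TSF as an image reconstruction task, we bridge the gap between image pre-training and downstream TSF tasks.
Surprisingly, without any further adaptation in the time series domain, the proposed VisionTS achieves better zero-shot forecasting performance than existing TSF foundation models.
With one epoch of fine-tuning, VisionTS improves its forecasts further and achieves state-of-the-art performance in most cases.
Extensive experiments reveal intrinsic similarities between images and real-world time series, suggesting that visual models may offer a "free lunch" for TSF and highlighting the potential of future cross-modality research.
Our code is publicly available at https://github.com/Keytoyze/VisionTS.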
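
The summary does not say how a series becomes an image, so the following is only a rough sketch of one plausible layout, not the authors' implementation. It assumes the series has a known period, folds the context window into a 2D grid (one column per period), rescales it to [0, 1] as grayscale pixels, and appends masked columns for the forecast horizon. The names `fold_series` and `to_masked_image` are hypothetical, and the masked autoencoder that would fill in the masked columns is omitted.

```python
from typing import List, Optional


def fold_series(series: List[float], period: int) -> List[List[float]]:
    """Fold a 1D series into a 2D grid: one row per phase, one column per period.

    Assumption for illustration: the period is known in advance.
    Leading values that do not fill a whole period are dropped.
    """
    n_cols = len(series) // period
    if n_cols == 0:
        raise ValueError("series must contain at least one full period")
    trimmed = series[len(series) - n_cols * period:]  # keep the most recent full periods
    # grid[r][c] is the value at phase r within period c
    return [[trimmed[c * period + r] for c in range(n_cols)] for r in range(period)]


def to_masked_image(
    context: List[float], horizon: int, period: int
) -> List[List[Optional[float]]]:
    """Lay out the context as grayscale pixels and mask the columns to be forecast.

    None marks a masked pixel that a reconstruction model would have to fill in.
    """
    grid = fold_series(context, period)
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) or 1.0  # avoid division by zero for a constant series
    n_future = -(-horizon // period)  # ceiling division: columns covering the horizon
    return [[(v - lo) / scale for v in row] + [None] * n_future for row in grid]


if __name__ == "__main__":
    context = [float(t) for t in range(12)]  # toy series with an assumed period of 4
    for row in to_masked_image(context, horizon=6, period=4):
        print(row)
```

Under this layout, forecasting amounts to reconstructing the masked right-hand columns and then reading them back column by column and undoing the rescaling.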

Abstract (original)

Foundation models have emerged as a promising approach in time series forecasting (TSF). Existing approaches either repurpose large language models (LLMs) or build large-scale time series datasets to develop TSF foundation models for universal forecasting. However, these methods face challenges due to the severe cross-domain gap or in-domain heterogeneity. This paper explores a new road to building a TSF foundation model from rich, high-quality natural images. Our key insight is that a visual masked autoencoder, pre-trained on the ImageNet dataset, can naturally be a numeric series forecaster. By reformulating TSF as an image reconstruction task, we bridge the gap between image pre-training and TSF downstream tasks. Surprisingly, without further adaptation in the time-series domain, the proposed VisionTS could achieve superior zero-shot forecasting performance compared to existing TSF foundation models. With fine-tuning for one epoch, VisionTS could further improve the forecasting and achieve state-of-the-art performance in most cases. Extensive experiments reveal intrinsic similarities between images and real-world time series, suggesting visual models may offer a “free lunch” for TSF and highlight the potential for future cross-modality research. Our code is publicly available at https://github.com/Keytoyze/VisionTS.

arXiv information

Authors	Mouxiang Chen, Lefei Shen, Zhuo Li, Xiaoyun Joy Wang, Jianling Sun, Chenghao Liu
Published	2024-10-02 17:21:47+00:00

Source and services used

arxiv.jp, Google
