VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Abstract

Foundation models have emerged as a promising approach in time series forecasting (TSF). Existing approaches either fine-tune large language models (LLMs) or build large-scale time-series datasets to develop TSF foundation models. However, these methods face challenges due to the severe cross-domain gap or in-domain heterogeneity. In this paper, we explore a new road to building a TSF foundation model from rich and high-quality natural images, based on the intrinsic similarities between images and time series. To bridge the gap between the two domains, we reformulate the TSF task as an image reconstruction task, which is further processed by a visual masked autoencoder (MAE) self-supervised pre-trained on the ImageNet dataset. Surprisingly, without further adaptation in the time-series domain, the proposed VisionTS could achieve superior zero-shot forecasting performance compared to existing TSF foundation models. With minimal fine-tuning, VisionTS could further improve the forecasting and achieve state-of-the-art performance in most cases. These findings suggest that visual models could be a free lunch for TSF and highlight the potential for future cross-domain research between computer vision and TSF. Our code is publicly available at https://github.com/Keytoyze/VisionTS.
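The abstract does not spell out how a forecasting problem becomes an image reconstruction problem, so the following is only a minimal sketch of one plausible reading. It assumes the context window is folded into a 2D array with one column per period, that the columns to the right are masked, and that a reconstruction model fills them in. The function names, the period-based layout, the normalization, and the `repeat_last_column` stand-in (which takes the place of a pretrained MAE and is not one) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def series_to_image(context, period):
    """Fold a 1D context window into a 2D array with one column per period.

    Assumes len(context) >= period; older leftover values are dropped.
    """
    n_cols = len(context) // period
    trimmed = np.asarray(context[-n_cols * period:], dtype=float)
    return trimmed.reshape(n_cols, period).T  # shape: (period, n_cols)


def forecast_by_reconstruction(context, period, horizon, reconstruct):
    """Forecast `horizon` steps by letting `reconstruct` fill masked columns."""
    visible = series_to_image(context, period)
    mean, std = visible.mean(), visible.std() + 1e-8

    # Append enough empty columns on the right to cover the horizon.
    n_masked = -(-horizon // period)  # ceiling division
    canvas = np.concatenate(
        [(visible - mean) / std, np.zeros((period, n_masked))], axis=1
    )
    mask = np.zeros(canvas.shape, dtype=bool)
    mask[:, visible.shape[1]:] = True  # True marks pixels to reconstruct

    filled = reconstruct(canvas, mask)

    # Read the reconstructed columns back as a 1D series, period by period.
    future = filled[:, visible.shape[1]:].T.reshape(-1)[:horizon]
    return future * std + mean


def repeat_last_column(canvas, mask):
    """Hypothetical stand-in for a pretrained MAE: copy the last visible column."""
    out = canvas.copy()
    n_visible = int((~mask[0]).sum())
    out[:, n_visible:] = out[:, [n_visible - 1]]
    return out


if __name__ == "__main__":
    context = np.tile([0.0, 1.0, 2.0, 3.0], 6)  # toy series with period 4
    print(forecast_by_reconstruction(context, 4, 6, repeat_last_column))
```

In this sketch the `reconstruct` argument is the only place a vision model would plug in; swapping the stand-in for an actual masked autoencoder would additionally require rendering the array at the image size and channel layout that model expects.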

arXiv information

Authors	Mouxiang Chen, Lefei Shen, Zhuo Li, Xiaoyun Joy Wang, Jianling Sun, Chenghao Liu
Published	2024-08-30 12:51:55+00:00

Source and services used

arxiv.jp, Google
