SPAT: Sensitivity-based Multihead-attention Pruning on Time Series Forecasting Models

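Summary

Attention-based architectures achieve superior performance in multivariate time series forecasting, but they are computationally expensive.
Techniques such as patching and adaptive masking have been developed to reduce model size and latency.
This work proposes a structured pruning method, SPAT ($\textbf{S}$ensitivity $\textbf{P}$runer for $\textbf{At}$tention), which selectively removes redundant attention mechanisms and yields highly effective models.
Unlike previous approaches, SPAT aims to remove entire attention modules, which reduces the risk of overfitting and enables speed-ups without requiring specialized hardware.
The authors propose a dynamic sensitivity metric, $\textbf{S}$ensitivity $\textbf{E}$nhanced $\textbf{N}$ormalized $\textbf{D}$ispersion (SEND), which measures the importance of each attention module during the pre-training phase.
Experiments on multivariate datasets show that SPAT-pruned models achieve reductions of 2.842% in MSE, 1.996% in MAE, and 35.274% in FLOPs.
Furthermore, SPAT-pruned models outperform existing lightweight, Mamba-based, and LLM-based SOTA methods in both standard and zero-shot inference, which highlights the importance of retaining only the most effective attention mechanisms.
The code is publicly available at https://anonymous.4open.science/r/SPAT-6042.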

Summary (original)

Attention-based architectures have achieved superior performance in multivariate time series forecasting but are computationally expensive. Techniques such as patching and adaptive masking have been developed to reduce their sizes and latencies. In this work, we propose a structured pruning method, SPAT ($\textbf{S}$ensitivity $\textbf{P}$runer for $\textbf{At}$tention), which selectively removes redundant attention mechanisms and yields highly effective models. Different from previous approaches, SPAT aims to remove the entire attention module, which reduces the risk of overfitting and enables speed-up without demanding specialized hardware. We propose a dynamic sensitivity metric, $\textbf{S}$ensitivity $\textbf{E}$nhanced $\textbf{N}$ormalized $\textbf{D}$ispersion (SEND), that measures the importance of each attention module during the pre-training phase. Experiments on multivariate datasets demonstrate that SPAT-pruned models achieve reductions of 2.842% in MSE, 1.996% in MAE, and 35.274% in FLOPs. Furthermore, SPAT-pruned models outperform existing lightweight, Mamba-based and LLM-based SOTA methods in both standard and zero-shot inference, highlighting the importance of retaining only the most effective attention mechanisms. We have made our code publicly available at https://anonymous.4open.science/r/SPAT-6042.
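
The abstract describes scoring each attention module and removing whole modules that score low. The sketch below illustrates only that general ranking-and-selection pattern. It is not the SEND metric, whose definition is not given here: the score used is a plain coefficient of variation of attention weights, chosen as a stand-in. The layer names, weight values, pruning ratio, and the assumption that near-uniform attention marks a less important module are all hypothetical.

```python
from statistics import mean, pstdev


def dispersion_score(attn_weights):
    """Stand-in importance score (NOT the paper's SEND metric).

    Returns the coefficient of variation (population std / mean) of a
    flat list of attention weights. Assumption for this sketch only:
    near-uniform attention (low dispersion) marks a less useful module.
    """
    m = mean(attn_weights)
    return pstdev(attn_weights) / m if m else 0.0


def select_modules_to_prune(scores, prune_ratio):
    """Return the names of the lowest-scoring modules.

    scores: dict mapping module name -> importance score.
    prune_ratio: fraction of modules to remove (hypothetical knob).
    """
    n_prune = int(len(scores) * prune_ratio)
    ranked = sorted(scores, key=scores.get)  # ascending: least important first
    return ranked[:n_prune]


# Hypothetical attention weights, one flat list per attention module.
layers = {
    "layer0": [0.25, 0.25, 0.25, 0.25],
    "layer1": [0.70, 0.10, 0.10, 0.10],
    "layer2": [0.40, 0.30, 0.20, 0.10],
    "layer3": [0.97, 0.01, 0.01, 0.01],
}

scores = {name: dispersion_score(w) for name, w in layers.items()}
pruned = select_modules_to_prune(scores, prune_ratio=0.5)
kept = [name for name in layers if name not in pruned]

print("pruned:", pruned)
print("kept:  ", kept)
```

Selecting whole modules rather than individual weights matches the structured-pruning idea in the abstract: a removed module is simply skipped at inference time, so no sparse-matrix hardware support is needed.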

arXiv information

Authors	Suhan Guo, Jiahong Deng, Mengjun Yi, Furao Shen, Jian Zhao
Published	2025-05-13 17:39:31+00:00
arXiv site	arxiv_id (pdf)

Source and services used

arxiv.jp, Google
