On the Regularization of Learnable Embeddings for Time Series Forecasting

📅 2024-10-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In multivariate time series forecasting, local embeddings often degenerate into sequence identifiers, undermining model generalization and transferability. To address this, this paper initiates the first systematic study of embedding regularization. We propose a suite of synergistic regularization strategies—including embedding perturbation (with periodic reset), contrastive constraints, sparsity enforcement, and Dropout variants—to prevent local embeddings from overfitting to individual sequence IDs and instead encourage them to capture transferable temporal patterns. Integrated into mainstream architectures such as Informer and Autoformer, our approach achieves average MAE reductions of 3.2–7.8% across multiple benchmark datasets. Notably, it significantly improves zero-shot and few-shot generalization performance. This work establishes a reusable, empirically validated embedding regularization paradigm for time-series foundation models.

Technology Category

Application Category

📝 Abstract
In forecasting multiple time series, accounting for the individual features of each sequence can be challenging. To address this, modern deep learning methods for time series analysis combine a shared (global) model with local layers, specific to each time series, often implemented as learnable embeddings. Ideally, these local embeddings should encode meaningful representations of the unique dynamics of each sequence. However, when these are learned end-to-end as parameters of a forecasting model, they may end up acting as mere sequence identifiers. Shared processing blocks may then become reliant on such identifiers, limiting their transferability to new contexts. In this paper, we address this issue by investigating methods to regularize the learning of local learnable embeddings for time series processing. Specifically, we perform the first extensive empirical study on the subject and show how such regularizations consistently improve performance in widely adopted architectures. Furthermore, we show that methods attempting to prevent the co-adaptation of local and global parameters by means of embeddings perturbation are particularly effective in this context. In this regard, we include in the comparison several perturbation-based regularization methods, going as far as periodically resetting the embeddings during training. The obtained results provide an important contribution to understanding the interplay between learnable local parameters and shared processing layers: a key challenge in modern time series processing models and a step toward developing effective foundation models for time series.
Problem

Research questions and friction points this paper is trying to address.

Regularize local learnable embeddings
Prevent over-reliance on sequence identifiers
Improve model transferability and performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Regularizes learnable embeddings for time series
Prevents co-adaptation via embedding perturbation
Periodically resets embeddings during training
🔎 Similar Papers
No similar papers found.
L
L. Butera
IDSIA, Università della Svizzera Italiana
G
G. Felice
University of Liverpool
Andrea Cini
Andrea Cini
SNSF Postdoctoral Fellow at University of Oxford, Postdoctoral affiliate at USI and UiT
Graph processingTime SeriesMachine LearningArtificial Intelligence
C
C. Alippi
IDSIA, Università della Svizzera Italiana, Politecnico di Milano