SEER: Transformer-based Robust Time Series Forecasting via Automated Patch Enhancement and Replacement

📅 2026-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge that existing patch-based time series forecasting methods struggle to dynamically identify and handle low-quality patches containing missing values, anomalies, or noise, thereby compromising predictive performance. To this end, the authors propose SEER, a novel framework that integrates a Mixture-of-Experts (MoE) architecture with a channel-adaptive awareness mechanism to enhance patch representations. SEER further introduces a two-stage learnable patch replacement strategy: it first dynamically filters out low-quality patches and then substitutes them with a global sequence representation, followed by causal attention to refine the features. Extensive experiments demonstrate that SEER significantly outperforms state-of-the-art methods across multiple benchmark datasets, exhibiting superior robustness and prediction accuracy—particularly under realistic conditions involving noise, anomalies, and distribution shifts.

📝 Abstract
Time series forecasting is important in many fields that require accurate predictions for decision-making. Patching techniques, commonly used and effective in time series modeling, help capture temporal dependencies by dividing the data into patches. However, existing patch-based methods fail to dynamically select patches and typically use all patches during the prediction process. In real-world time series, data collection often suffers from quality issues, such as missing values, distribution shifts, anomalies, and white noise, which may cause some patches to contain low-quality information that negatively impacts the prediction results. To address this issue, this study proposes a robust time series forecasting framework called SEER. First, we propose an Augmented Embedding Module, which improves patch-wise representations using a Mixture-of-Experts (MoE) architecture and obtains series-wise token representations through a channel-adaptive perception mechanism. Second, we introduce a Learnable Patch Replacement Module, which enhances forecasting robustness and model accuracy through a two-stage process: 1) a dynamic filtering mechanism eliminates negative patch-wise tokens; 2) a replaced attention module substitutes the identified low-quality patches with the global series-wise token, further refining their representations through a causal attention mechanism. Comprehensive experimental results demonstrate the state-of-the-art (SOTA) performance of SEER.
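The two-stage replacement idea in the abstract can be sketched with fixed heuristics. In SEER both the filter and the replacement are learned end-to-end; the `patchify` helper, the variance-based quality score, and the mean-patch stand-in for the "global series-wise token" below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches of length patch_len."""
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def replace_low_quality(patches, scores, threshold):
    """Stage 1: filter out patches whose quality score falls below `threshold`.
    Stage 2: substitute them with a global representation of the whole series
    (here simply the mean patch; SEER uses a learned series-wise token)."""
    global_token = patches.mean(axis=0)
    keep = scores >= threshold
    cleaned = np.where(keep[:, None], patches, global_token)
    return cleaned, keep

# Example: a stable series with one anomalous patch.
patches = patchify(np.array([1.0, 1, 1, 1, 1, 1, 1, 1, 0, 50, -50, 0, 1, 1, 1, 1]), 4)
# Hypothetical quality score: high variance -> likely anomaly/noise -> low score.
scores = 1.0 / (1.0 + patches.std(axis=1))
cleaned, keep = replace_low_quality(patches, scores, threshold=0.5)
# Only the anomalous third patch is filtered and replaced.
```

In the full model, the refined tokens would then pass through causal attention before the forecasting head.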
Problem

Research questions and friction points this paper is trying to address.

time series forecasting
patch-based methods
low-quality data
anomalies
distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Patch Replacement
Mixture-of-Experts
Robust Time Series Forecasting
Dynamic Filtering
Causal Attention
Xiangfei Qiu
Master Student, East China Normal University
Time Series · Benchmarking · Spatio-temporal Data
Xvyuan Liu
East China Normal University
Time Series
Tianen Shen
School of Data Science and Engineering, East China Normal University, Shanghai, China
Xingjian Wu
PhD Student, East China Normal University
Time Series Analysis · Foundation Model · Multi-Modality
Hanyin Cheng
School of Data Science and Engineering, East China Normal University, Shanghai, China
Bin Yang
School of Data Science and Engineering, East China Normal University, Shanghai, China
Jilin Hu
Professor, East China Normal University
Spatial-Temporal Data · Machine Learning · Transportation