SEER: Transformer-based Robust Time Series Forecasting via Automated Patch Enhancement and Replacement

📅 2026-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge that existing patch-based time series forecasting methods struggle to dynamically identify and handle low-quality patches containing missing values, anomalies, or noise, thereby compromising predictive performance. To this end, the authors propose SEER, a novel framework that integrates a Mixture-of-Experts (MoE) architecture with a channel-adaptive awareness mechanism to enhance patch representations. SEER further introduces a two-stage learnable patch replacement strategy: it first dynamically filters out low-quality patches and then substitutes them with a global sequence representation, followed by causal attention to refine the features. Extensive experiments demonstrate that SEER significantly outperforms state-of-the-art methods across multiple benchmark datasets, exhibiting superior robustness and prediction accuracy—particularly under realistic conditions involving noise, anomalies, and distribution shifts.

📝 Abstract
Time series forecasting is important in many fields that require accurate predictions for decision-making. Patching techniques, commonly used and effective in time series modeling, help capture temporal dependencies by dividing the data into patches. However, existing patch-based methods fail to dynamically select patches and typically use all patches during the prediction process. In real-world time series, data collection often suffers from quality issues, such as missing values, distribution shifts, anomalies, and white noise, which may cause some patches to contain low-quality information that negatively impacts the prediction results. To address this issue, this study proposes a robust time series forecasting framework called SEER. First, we propose an Augmented Embedding Module, which improves patch-wise representations using a Mixture-of-Experts (MoE) architecture and obtains series-wise token representations through a channel-adaptive perception mechanism. Second, we introduce a Learnable Patch Replacement Module, which enhances forecasting robustness and model accuracy through a two-stage process: 1) a dynamic filtering mechanism eliminates negative patch-wise tokens; 2) a replaced attention module substitutes the identified low-quality patches with the global series-wise token, further refining their representations through a causal attention mechanism. Comprehensive experimental results demonstrate the state-of-the-art (SOTA) performance of SEER.
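The two-stage replacement idea in the abstract can be sketched with fixed heuristics. In SEER both the filter and the replacement are learned end-to-end; the `patchify` helper, the variance-based quality score, and the mean-patch stand-in for the "global series-wise token" below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches of length patch_len."""
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def replace_low_quality(patches, scores, threshold):
    """Stage 1: filter out patches whose quality score falls below `threshold`.
    Stage 2: substitute them with a global representation of the whole series
    (here simply the mean patch; SEER uses a learned series-wise token)."""
    global_token = patches.mean(axis=0)
    keep = scores >= threshold
    cleaned = np.where(keep[:, None], patches, global_token)
    return cleaned, keep

# Example: a stable series with one anomalous patch.
patches = patchify(np.array([1.0, 1, 1, 1, 1, 1, 1, 1, 0, 50, -50, 0, 1, 1, 1, 1]), 4)
# Hypothetical quality score: high variance -> likely anomaly/noise -> low score.
scores = 1.0 / (1.0 + patches.std(axis=1))
cleaned, keep = replace_low_quality(patches, scores, threshold=0.5)
# Only the anomalous third patch is filtered and replaced.
```

In the full model, the refined tokens would then pass through causal attention before the forecasting head.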
Problem

Research questions and friction points this paper is trying to address.

time series forecasting
patch-based methods
low-quality data
anomalies
distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Patch Replacement
Mixture-of-Experts
Robust Time Series Forecasting
Dynamic Filtering
Causal Attention
Xiangfei Qiu
Master Student, East China Normal University
Time Series · Benchmarking · Spatio-temporal Data
Xvyuan Liu
East China Normal University
Time Series
Tianen Shen
School of Data Science and Engineering, East China Normal University, Shanghai, China
Xingjian Wu
PhD Student, East China Normal University
Time Series Analysis · Foundation Model · Multi-Modality
Hanyin Cheng
School of Data Science and Engineering, East China Normal University, Shanghai, China
Bin Yang
School of Data Science and Engineering, East China Normal University, Shanghai, China
Jilin Hu
Professor, East China Normal University
Spatial-Temporal Data · Machine Learning · Transportation