Beyond Holistic Models: Systematic Component-level Benchmarking of Deep Multivariate Time-Series Forecasting

📅 2026-05-26
🤖 AI Summary
Existing research on multivariate time series forecasting focuses primarily on holistic model design and lacks a systematic understanding of the roles played by individual internal components. This work proposes TSCOMP, a benchmark that decomposes deep forecasting methods into fine-grained components: preprocessing, encoding strategies, network architectures, and optimization techniques. Using constrained orthogonal experimental design and multi-view analysis across both mainstream architectures and large time-series models, it builds a component-level performance corpus of over 20,000 evaluations. A zero-shot automated component selection method learned from this corpus consistently outperforms state-of-the-art models across multiple datasets, indicating that systematic component composition can surpass manually designed monolithic architectures.
📝 Abstract
While previous research in multivariate time series forecasting has focused on developing complex holistic models, this work advocates for a shift toward a granular, component-level understanding of what drives their performance. We propose TSCOMP, the first large-scale benchmark that systematically deconstructs deep forecasting methods into their core, fine-grained components, spanning series preprocessing, encoding strategies, network architectures (including specific and large time-series models), and optimization methods. Using constrained orthogonal experimental design and extensive evaluations, we conduct multi-view analyses that reveal component effectiveness across different backbones and data characteristics, as well as interactions among components. Beyond providing insights, this benchmark establishes a fine-grained performance corpus comprising over 20,000 model-dataset evaluations, which supports the learning of automated component selection, enabling zero-shot model construction on new datasets. Our experiments demonstrate that the corpus-driven approach, despite its simplicity, consistently outperforms state-of-the-art methods, validating the soundness of our evaluation design and confirming that systematic component selection surpasses manually designed complex architectures. All code and the performance corpus are publicly available at https://github.com/SUFE-AILAB/TSCOMP.
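To make the idea of corpus-driven, zero-shot component selection concrete, here is a minimal sketch. It is not the authors' method: the abstract does not specify how selection is learned, so this example assumes a simple nearest-neighbor lookup over dataset meta-features. The corpus rows, meta-feature values, component names, and the `select_config` function are all hypothetical and exist only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus. Each row records one evaluation: the dataset's
# meta-features (made-up 2-D vectors), a component configuration
# (normalization, encoding), and the resulting error. All values are invented.
CORPUS = [
    {"meta": (0.9, 0.1), "config": ("instance_norm", "patch"), "mse": 0.40},
    {"meta": (0.9, 0.1), "config": ("no_norm", "patch"), "mse": 0.55},
    {"meta": (0.8, 0.2), "config": ("instance_norm", "patch"), "mse": 0.42},
    {"meta": (0.8, 0.2), "config": ("no_norm", "point"), "mse": 0.60},
    {"meta": (0.2, 0.8), "config": ("instance_norm", "patch"), "mse": 0.70},
    {"meta": (0.2, 0.8), "config": ("no_norm", "point"), "mse": 0.50},
]


def select_config(new_meta, corpus, k=2):
    """Pick a configuration for an unseen dataset without training on it.

    Assumed strategy (not taken from the paper): find the k corpus datasets
    whose meta-features are closest to new_meta, then return the
    configuration with the lowest mean error on those datasets.
    """
    # Rank the distinct corpus datasets by Euclidean distance to the new one.
    datasets = sorted({row["meta"] for row in corpus},
                      key=lambda m: math.dist(m, new_meta))
    nearest = set(datasets[:k])

    # Collect the errors of every configuration seen on the nearest datasets.
    errors = defaultdict(list)
    for row in corpus:
        if row["meta"] in nearest:
            errors[row["config"]].append(row["mse"])

    # Return the configuration with the lowest mean error.
    return min(errors, key=lambda c: sum(errors[c]) / len(errors[c]))


if __name__ == "__main__":
    print(select_config((0.85, 0.15), CORPUS, k=2))
    print(select_config((0.10, 0.90), CORPUS, k=1))
```

The lookup needs no training on the new dataset, which is the sense in which such a selection is zero-shot. A learned ranking model over the corpus could replace the nearest-neighbor step.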
Problem

Research questions and friction points this paper is trying to address.

multivariate time series forecasting
component-level analysis
systematic benchmarking
deep forecasting models
model decomposition
Innovation

Methods, ideas, or system contributions that make the work stand out.

component-level benchmarking
multivariate time-series forecasting
automated model construction
orthogonal experimental design
performance corpus