TimesBERT: A BERT-Style Foundation Model for Time Series Understanding

📅 2025-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Time-series understanding tasks, including classification, imputation, and anomaly detection, lack a universal representation model. Method: We propose the first BERT-style foundation model for multi-task time-series understanding. It introduces functional token prediction as a parallel pretraining objective that explicitly captures multi-granularity structural patterns and temporal dynamics. We design a mapping mechanism that transforms multivariate time series into document-like structures and establish a dual-task pretraining framework combining masked modeling with functional token prediction. The model is pretrained on 260 billion cross-domain time-series data points. Contribution/Results: Extensive experiments demonstrate consistent superiority over task-specific models and language-model transfer baselines across four downstream understanding tasks, marking the first successful adaptation of the BERT architecture to general-purpose time-series representation learning and establishing a universal foundation model for time-series understanding.
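To make the mapping idea concrete, here is a minimal sketch of how a multivariate series might be turned into a document-like token sequence (patch "words" per variate, plus structural marker tokens) and masked for pretraining. The token names (`[VAR]`, `[DOC]`, `[MASK]`), patch length, and mask ratio are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def series_to_tokens(x, patch_len=4):
    """Split each variate of a multivariate series (n_variates, seq_len)
    into fixed-length patches, interleaving hypothetical functional tokens
    that mark variate and series boundaries (assumed scheme)."""
    n_var, seq_len = x.shape
    assert seq_len % patch_len == 0
    tokens = []
    for v in range(n_var):
        tokens.append(("[VAR]", v))              # variate-level marker (assumed)
        for patch in x[v].reshape(-1, patch_len):
            tokens.append(("PATCH", patch))      # one patch = one "word"
    tokens.append(("[DOC]", None))               # series-level marker (assumed)
    return tokens

def mask_patches(tokens, mask_ratio=0.25, seed=0):
    """Randomly replace a fraction of patch tokens with [MASK] so a
    BERT-style encoder can be trained to reconstruct them."""
    rng = np.random.default_rng(seed)
    patch_idx = [i for i, (t, _) in enumerate(tokens) if t == "PATCH"]
    n_mask = max(1, int(len(patch_idx) * mask_ratio))
    masked = set(rng.choice(patch_idx, size=n_mask, replace=False).tolist())
    return [("[MASK]", None) if i in masked else tok
            for i, tok in enumerate(tokens)], masked
```

This mirrors how a multi-sentence document is tokenized: variates play the role of sentences and patches the role of words, which is the structural analogy the summary describes.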

📝 Abstract
Time series analysis is crucial in diverse scenarios. Beyond forecasting, considerable real-world tasks are categorized into classification, imputation, and anomaly detection, underscoring different capabilities termed time series understanding in this paper. While GPT-style models have been positioned as foundation models for time series forecasting, the BERT-style architecture, which has made significant advances in natural language understanding, has not been fully unlocked for time series understanding, possibly attributed to the undesirable dropout of essential elements of BERT. In this paper, inspired by the shared multi-granularity structure between multivariate time series and multisentence documents, we design TimesBERT to learn generic representations of time series including temporal patterns and variate-centric characteristics. In addition to a natural adaptation of masked modeling, we propose a parallel task of functional token prediction to embody vital multi-granularity structures. Our model is pre-trained on 260 billion time points across diverse domains. Leveraging multi-granularity representations, TimesBERT achieves state-of-the-art performance across four typical downstream understanding tasks, outperforming task-specific models and language pre-trained backbones, positioning it as a versatile foundation model for time series understanding.
Problem

Research questions and friction points this paper is trying to address.

Classification, imputation, and anomaly detection lack a shared, general-purpose representation model for time series.
BERT-style architectures, despite their success in natural language understanding, have not been fully adapted to time series, in part because prior adaptations dropped essential elements of BERT.
Existing task-specific models and language-pretrained backbones do not deliver state-of-the-art performance across all understanding tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

BERT-style architecture for time series
Masked modeling and functional token prediction
Pre-trained on 260 billion time points
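The two pretraining objectives above can be combined into a single loss. The sketch below is an assumption about how such a dual-task objective might look (MSE for masked-patch reconstruction, cross-entropy for functional token prediction, equal weighting); the paper's exact losses and weights may differ.

```python
import numpy as np

def dual_task_loss(recon_pred, recon_true, func_logits, func_labels, alpha=0.5):
    """Hedged sketch of a dual-task pretraining objective:
    alpha * masked-reconstruction MSE + (1 - alpha) * functional-token
    cross-entropy. Weighting and loss choices are illustrative assumptions."""
    # MSE over the values of masked patches
    mse = float(np.mean((recon_pred - recon_true) ** 2))
    # numerically stable softmax cross-entropy over functional-token classes
    z = func_logits - func_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = float(-log_probs[np.arange(len(func_labels)), func_labels].mean())
    return alpha * mse + (1 - alpha) * ce
```

In a real training loop both terms would be computed from the encoder's outputs at masked positions and at functional-token positions, respectively.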