🤖 AI Summary
To address the heterogeneity of large time-series corpora, which span multiple sources and modalities with varying sampling rates and variate counts, this paper introduces OTiS, an open foundation model for general time series analysis. Methodologically, the authors propose a tokeniser with learnable domain signatures for input-adaptive tokenisation, a dual masking pre-training strategy, and a normalised cross-correlation loss, which together capture temporal structure, multi-scale dynamics, and cross-variate dependencies. Evaluated across downstream tasks spanning classification, regression, and forecasting, OTiS consistently outperforms state-of-the-art methods and shows strong cross-domain generalisation. The model architecture, training code, and pre-trained weights are publicly released.
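To make the loss concrete: a normalised cross-correlation between a reconstruction and its target is invariant to per-series scale and offset, so `1 - NCC` rewards matching the signal's shape rather than its absolute values. The sketch below is a minimal, hypothetical form of such a loss (the function name, shapes, and exact normalisation are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def ncc_loss(pred, target, eps=1e-8):
    # Hypothetical sketch of a normalised cross-correlation loss:
    # centre each series, then divide the inner product by the product
    # of the norms. Loss = 1 - NCC, so a reconstruction that matches the
    # target up to scale and shift gives a loss near 0.
    pred = pred - pred.mean(axis=-1, keepdims=True)
    target = target - target.mean(axis=-1, keepdims=True)
    num = (pred * target).sum(axis=-1)
    den = np.sqrt((pred ** 2).sum(axis=-1) * (target ** 2).sum(axis=-1)) + eps
    return float(np.mean(1.0 - num / den))

t = np.linspace(0, 2 * np.pi, 100)
x = np.sin(t)[None, :]          # one variate, 100 time steps
print(round(ncc_loss(x, 2 * x + 1), 4))  # scale/shift-invariant → 0.0
```

Note that a plain MSE loss would heavily penalise the rescaled-and-shifted target here, which is one motivation for a correlation-based objective across heterogeneous domains with very different value ranges.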
📝 Abstract
Recent breakthroughs in natural language processing and computer vision, driven by efficient pre-training on large datasets, have enabled foundation models to excel on a wide range of tasks. However, this potential has not yet been fully realised in time series analysis, as existing methods fail to address the heterogeneity in large time series corpora. Prevalent in domains ranging from medicine to finance, time series vary substantially in characteristics such as variate count, inter-variate relationships, temporal patterns, and sampling frequency. To address this, we introduce a novel pre-training paradigm specifically designed to handle time series heterogeneity. We propose a tokeniser with learnable domain signatures, a dual masking strategy, and a normalised cross-correlation loss, enabling our open model for general time series analysis (OTiS) to efficiently learn from large time series corpora. Extensive benchmarking on diverse tasks, such as classification, regression, and forecasting, demonstrates that OTiS outperforms state-of-the-art baselines. Our code and pre-trained weights are available at https://github.com/oetu/otis.
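One way to picture the "tokeniser with learnable domain signatures" is that each domain owns a learnable vector that is added to every patch token, so a single shared encoder can ingest series from heterogeneous corpora while still being told where each one came from. The sketch below illustrates that idea only; the variable names, shapes, and the additive combination are assumptions for illustration, not OTiS's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 25-step patches embedded into a 16-d token space.
D_MODEL, PATCH = 16, 25

# One learnable signature vector per domain (here initialised randomly;
# in training these would be optimised alongside the encoder).
domain_sig = {d: rng.normal(0, 0.02, D_MODEL) for d in ("ecg", "finance")}
W_patch = rng.normal(0, 0.02, (PATCH, D_MODEL))  # shared patch embedding

def tokenise(series, domain):
    # Split a univariate series into non-overlapping patches, embed each
    # patch with the shared projection, then add the domain's signature.
    n = len(series) // PATCH
    patches = series[: n * PATCH].reshape(n, PATCH)
    return patches @ W_patch + domain_sig[domain]

tokens = tokenise(rng.normal(size=500), "ecg")
print(tokens.shape)  # → (20, 16)
```

The design point is that the heavy machinery (patch projection, encoder) is shared across domains, while the cheap per-domain signature absorbs domain-specific statistics, which is what lets pre-training scale over a heterogeneous corpus.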