The Multiverse of Time Series Machine Learning: an Archive for Multivariate Time Series Classification

📅 2026-03-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of large, unified, and diverse benchmarks for multivariate time series classification (MTSC), which has made fair comparison of algorithms difficult. The authors more than quadruple the UEA MTSC archive, from 30 to 133 classification problems, by consolidating other collections and standalone datasets into a single repository, renamed the Multiverse archive. Preprocessed versions of the datasets that contain missing values or unequal-length series bring the total to 147 datasets. Because running experiments on the full archive is computationally demanding, the authors recommend MV-core, a subset of 30 core tasks, for initial exploration. The accompanying repository provides a framework compatible with both aeon and scikit-learn, a record of published results, and an interactive interface for exploring them, together with reproducible baseline results for established and recent classifiers.
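As a rough illustration of what preprocessing missing values and unequal-length series can involve, the sketch below fills gaps by linear interpolation and right-pads shorter series with zeros. Both choices are assumptions made for the example, as are the function names `fill_missing` and `pad_to_equal_length`; the summary does not state which strategy the Multiverse archive actually uses.

```python
import numpy as np


def fill_missing(series: np.ndarray) -> np.ndarray:
    """Linearly interpolate NaNs in each channel of a (n_channels, n_timepoints) array."""
    filled = series.astype(float).copy()
    t = np.arange(filled.shape[1])
    for channel in filled:  # each row is a view, so edits update `filled`
        missing = np.isnan(channel)
        if missing.all():
            channel[:] = 0.0  # nothing observed to interpolate from
        elif missing.any():
            # np.interp holds the nearest observed value at the series edges.
            channel[missing] = np.interp(t[missing], t[~missing], channel[~missing])
    return filled


def pad_to_equal_length(collection, pad_value=0.0):
    """Right-pad each (n_channels, n_timepoints) series to the longest length."""
    max_len = max(s.shape[1] for s in collection)
    n_channels = collection[0].shape[0]
    out = np.full((len(collection), n_channels, max_len), pad_value, dtype=float)
    for i, s in enumerate(collection):
        out[i, :, : s.shape[1]] = s
    return out


# Two toy series with 2 channels each, different lengths, and some gaps.
raw = [
    np.array([[1.0, np.nan, 3.0], [0.0, 1.0, np.nan]]),
    np.array([[5.0, 6.0, 7.0, 8.0, 9.0], [np.nan, 2.0, 2.0, 2.0, 2.0]]),
]
X = pad_to_equal_length([fill_missing(s) for s in raw])
print(X.shape)  # (2, 2, 5)
```

The result is a single dense array of shape (n_cases, n_channels, n_timepoints), which is the layout most fixed-length classifiers expect. Padding and interpolation both alter the data, so results on preprocessed and raw versions of a dataset are not directly interchangeable.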

📝 Abstract

Time series machine learning (TSML) is a growing research field that spans a wide range of tasks. The popularity of established tasks such as classification, clustering, and extrinsic regression has, in part, been driven by the availability of benchmark datasets. An archive of 30 multivariate time series classification datasets, introduced in 2018 and commonly known as the UEA archive, has since become an essential resource cited in hundreds of publications. We present a substantial expansion of this archive that more than quadruples its size, from 30 to 133 classification problems. We also release preprocessed versions of datasets containing missing values or unequal length series, bringing the total number of datasets to 147. Reflecting the growth of the archive and the broader community, we rebrand it as the Multiverse archive to capture its diversity of domains. The Multiverse archive includes datasets from multiple sources, consolidating other collections and standalone datasets into a single, unified repository. Recognising that running experiments across the full archive is computationally demanding, we recommend a subset of the full archive called Multiverse-core (MV-core) for initial exploration. To support researchers in using the new archive, we provide detailed guidance and a baseline evaluation of established and recent classification algorithms, establishing performance benchmarks for future research. We have created a dedicated repository for the Multiverse archive that provides a common aeon and scikit-learn compatible framework for reproducibility, an extensive record of published results, and an interactive interface to explore the results.
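To make the idea of a baseline evaluation across several datasets concrete, here is a minimal sketch of a classifier that follows the scikit-learn `fit`/`predict` convention and is scored on a handful of problems. Everything in it is hypothetical: the 1-nearest-neighbour classifier, the synthetic data generator, the alternating train/test split, and the problem names stand in for the real archive, and the (n_cases, n_channels, n_timepoints) array layout is an assumption. It is not the repository's API.

```python
import numpy as np


class NearestNeighbourEuclidean:
    """1-NN with Euclidean distance over flattened (n_channels, n_timepoints) series."""

    def fit(self, X, y):
        self._X = np.asarray(X, dtype=float).reshape(len(X), -1)
        self._y = np.asarray(y)
        return self

    def predict(self, X):
        flat = np.asarray(X, dtype=float).reshape(len(X), -1)
        # Squared distances between every test case and every training case.
        d = ((flat[:, None, :] - self._X[None, :, :]) ** 2).sum(axis=2)
        return self._y[d.argmin(axis=1)]


def make_toy_problem(seed, n_cases=40, n_channels=3, n_timepoints=20):
    """Synthetic two-class problem: class 1 has a shifted mean in every channel."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_cases // 2)
    X = rng.normal(size=(n_cases, n_channels, n_timepoints)) + 2.0 * y[:, None, None]
    return X, y


def evaluate(classifier, problems):
    """Return {problem name: test accuracy} using an alternating train/test split."""
    results = {}
    for name, (X, y) in problems.items():
        train, test = slice(0, None, 2), slice(1, None, 2)
        preds = classifier.fit(X[train], y[train]).predict(X[test])
        results[name] = float((preds == y[test]).mean())
    return results


# Hypothetical stand-ins for a small subset of archive problems.
problems = {f"toy_{seed}": make_toy_problem(seed) for seed in range(3)}
scores = evaluate(NearestNeighbourEuclidean(), problems)
print(scores)
```

Keeping the classifier behind a plain `fit`/`predict` interface means the same evaluation loop can be reused unchanged for any estimator that follows that convention, which is the practical benefit of a common framework.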

Problem

Research questions and friction points this paper is trying to address.

multivariate time series classification

benchmark datasets

time series machine learning

data archive

reproducibility

Innovation

Methods, ideas, or system contributions that make the work stand out.

multivariate time series classification

benchmark archive

reproducible framework