The Multiverse of Time Series Machine Learning: an Archive for Multivariate Time Series Classification

📅 2026-03-20
🤖 AI Summary
This work addresses the lack of large-scale, unified, and diverse benchmarks for multivariate time series classification (MTSC), which has hindered fair evaluation and comparison of algorithms. The authors expand the UEA MTSC archive more than fourfold, from 30 to 133 classification problems, consolidating heterogeneous data sources into the Multiverse archive. Adding preprocessed versions of the datasets that contain missing values or unequal-length series brings the total to 147 datasets. To reduce computational overhead, they also introduce MV-core, a subset of 30 core tasks recommended for initial exploration. The archive is accompanied by a repository offering a framework compatible with both aeon and scikit-learn, a record of published results, and an interactive interface for exploring them. The paper also reports baseline results for established and recent classifiers, giving future MTSC research a reproducible reference point.

📝 Abstract
Time series machine learning (TSML) is a growing research field that spans a wide range of tasks. The popularity of established tasks such as classification, clustering, and extrinsic regression has, in part, been driven by the availability of benchmark datasets. An archive of 30 multivariate time series classification datasets, introduced in 2018 and commonly known as the UEA archive, has since become an essential resource cited in hundreds of publications. We present a substantial expansion of this archive that more than quadruples its size, from 30 to 133 classification problems. We also release preprocessed versions of datasets containing missing values or unequal length series, bringing the total number of datasets to 147. Reflecting the growth of the archive and the broader community, we rebrand it as the Multiverse archive to capture its diversity of domains. The Multiverse archive includes datasets from multiple sources, consolidating other collections and standalone datasets into a single, unified repository. Recognising that running experiments across the full archive is computationally demanding, we recommend a subset of the full archive called Multiverse-core (MV-core) for initial exploration. To support researchers in using the new archive, we provide detailed guidance and a baseline evaluation of established and recent classification algorithms, establishing performance benchmarks for future research. We have created a dedicated repository for the Multiverse archive that provides a common aeon and scikit-learn compatible framework for reproducibility, an extensive record of published results, and an interactive interface to explore the results.
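The abstract mentions preprocessed versions of datasets that contain missing values or unequal-length series, but it does not say how they were produced. The sketch below shows one common way to obtain gap-free, equal-length multivariate cases: linear interpolation of missing values followed by right-padding to the longest series. Both choices, and the helper names `impute_linear`, `pad_to_length`, and `preprocess`, are assumptions made for illustration and are not taken from the paper.

```python
import math


def impute_linear(series):
    """Fill NaN gaps in one channel by linear interpolation.

    Gaps at the start or end copy the nearest observed value.
    (Assumed strategy, not the paper's documented procedure.)
    """
    observed = [i for i, v in enumerate(series) if not math.isnan(v)]
    if not observed:
        raise ValueError("channel has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if not math.isnan(v):
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def pad_to_length(series, length, pad_value=0.0):
    """Truncate or right-pad one channel to a fixed length."""
    return list(series[:length]) + [pad_value] * max(0, length - len(series))


def preprocess(cases, pad_value=0.0):
    """Return gap-free, equal-length cases.

    `cases` is a list of cases; each case is a list of channels,
    and each channel is a list of floats (NaN marks a missing value).
    """
    target = max(len(channel) for case in cases for channel in case)
    return [
        [pad_to_length(impute_linear(channel), target, pad_value) for channel in case]
        for case in cases
    ]


nan = float("nan")
cases = [
    [[1.0, nan, 3.0], [0.0, 0.5, nan]],                       # short case with gaps
    [[2.0, 4.0, 6.0, 8.0, 10.0], [nan, 1.0, 1.0, 1.0, 1.0]],  # longer case
]
clean = preprocess(cases)
print(clean[0])
```

After this step every case has the same number of time points per channel, so the collection can be stacked into a single 3D array of shape (cases, channels, time points) before being passed to a classifier. Padding with zeros is only one option; truncating to the shortest series or resampling are alternatives with different trade-offs.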
Problem

Research questions and friction points this paper is trying to address.

multivariate time series classification
benchmark datasets
time series machine learning
data archive
reproducibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

multivariate time series classification
benchmark archive
reproducible framework
missing value handling
algorithm benchmarking
Matthew Middlehurst
University of Bradford
Aiden Rushbrooke
University of East Anglia
Ali Ismail-Fawaz
Université de Haute-Alsace
Maxime Devanne
Associate Professor of Computer Science, Université de Haute-Alsace
Machine Learning, Time Series, 3D Human Motion, Computer Vision, Shape Analysis
Germain Forestier
Full Professor of Computer Science, Université de Haute-Alsace
Time Series, Machine Learning, Data Mining, Artificial Intelligence, Deep Learning
Angus Dempster
Monash University
Geoffrey I. Webb
Monash University
Christopher Holder
University of Southampton
Anthony Bagnall
University of Southampton