🤖 AI Summary
This work addresses "partial blackout," a previously unexplored and complex missingness pattern in multivariate time series in which a subset of features is missing over multiple consecutive time steps.
Method: We propose a two-stage self-attention diffusion model that formally defines partial blackout and integrates Transformer-based self-attention with denoising diffusion probabilistic models (DDPMs). The approach introduces mask-aware training and spatiotemporally decoupled reconstruction, enabling robust end-to-end modeling of incomplete sequences.
Contribution/Results: (1) We are the first to formally identify and systematically address the partial blackout problem; (2) our model generalizes well and remains robust under diverse missingness patterns; (3) on multiple benchmark and real-world industrial datasets, it reduces average MAE by 18.7% over state-of-the-art methods while maintaining high scalability and computational efficiency.
📝 Abstract
Missing values in multivariate time series data can harm machine learning performance and introduce bias. These gaps arise from sensor malfunctions, blackouts, and human error and are typically addressed by data imputation. Previous work has tackled imputation under random missingness, complete blackouts, and forecasting scenarios. The current paper addresses a more general missing pattern, which we call "partial blackout," where a subset of features is missing for consecutive time steps. We introduce a two-stage imputation process that uses self-attention and diffusion processes to model feature and temporal correlations. Notably, our model handles missing data during training as well, enhancing adaptability and ensuring reliable imputation even with incomplete datasets. Experiments on benchmark and two real-world time series datasets demonstrate that our model outperforms the state of the art in partial blackout scenarios and scales better.
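To make the missingness pattern concrete: a partial blackout means a chosen subset of features is unobserved for a contiguous run of time steps, while the remaining features stay observed. Below is a minimal sketch of how such an observation mask could be constructed for simulation or evaluation; the function name and its parameters (`missing_features`, `start`, `length`) are illustrative assumptions, not part of the paper's code.

```python
import numpy as np

def partial_blackout_mask(n_steps, n_features, missing_features, start, length):
    """Build an observation mask (1 = observed, 0 = missing) where a subset
    of features is missing for `length` consecutive time steps.

    Note: this is an illustrative sketch of the partial-blackout pattern,
    not the paper's implementation.
    """
    mask = np.ones((n_steps, n_features), dtype=int)
    # Zero out the chosen features over one contiguous window of time steps.
    mask[start:start + length, missing_features] = 0
    return mask

# Example: 10 time steps, 4 features; features 1 and 3 blacked out at steps 3-6.
mask = partial_blackout_mask(10, 4, [1, 3], start=3, length=4)
```

Random missingness and complete blackout are the two extremes of this pattern: the former scatters zeros independently, while the latter zeroes every feature in the window.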