🤖 AI Summary
This work addresses distributional shift, bias, and overfitting in multivariate time series imputation caused by non-stationarity and systematic missingness. To this end, we propose the Distributionally Robust Imputation Objective (DRIO), which introduces distributionally robust optimization into time series imputation for the first time. DRIO models the mismatch between the observed and true data distributions with a Wasserstein ambiguity set and jointly minimizes the reconstruction error and the worst-case distributional divergence. Leveraging duality theory, the infinite-dimensional optimization over measures is reduced to an adversarial search over observed sample trajectories, enabling an adversarial learning algorithm compatible with deep neural networks. Extensive experiments on real-world datasets show that DRIO consistently delivers significant gains under both missing-completely-at-random and missing-not-at-random scenarios, attaining a Pareto-optimal trade-off between reconstruction accuracy and distributional consistency.
📝 Abstract
Multivariate time series (MTS) imputation is often compromised by a mismatch between the observed and true data distributions -- a bias exacerbated by non-stationarity and systematic missingness. Standard methods that minimize reconstruction error or encourage distributional alignment risk overfitting to these biased observations. We propose the Distributionally Robust Imputation Objective (DRIO), which jointly minimizes reconstruction error and the divergence between the imputer and a worst-case distribution within a Wasserstein ambiguity set. We derive a tractable dual formulation that reduces the infinite-dimensional optimization over measures to an adversarial search over sample trajectories, and propose an adversarial learning algorithm compatible with flexible deep learning backbones. Comprehensive experiments on diverse real-world datasets show that DRIO consistently improves imputation under both missing-completely-at-random and missing-not-at-random settings, reaching Pareto-optimal trade-offs between reconstruction accuracy and distributional alignment.
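The dual reduction described above -- replacing a sup over distributions in a Wasserstein ball with an adversarial search over perturbed samples -- can be illustrated in miniature. The sketch below is *not* the paper's DRIO algorithm; it is a generic Wasserstein-DRO training loop (assumed setup) for a scalar location model with squared loss, where the inner maximization over each perturbed sample has a closed form for a fixed dual penalty weight `gamma > 1`, and the outer minimization takes a gradient step at the adversarial samples (Danskin's theorem). All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for observed trajectories: 200 scalar samples from the "true" data.
z = rng.normal(loc=3.0, scale=1.0, size=200)

gamma = 4.0   # dual Wasserstein penalty weight (assumed fixed; > 1 keeps the inner problem concave)
theta = 0.0   # model parameter: here just a location estimate
lr = 0.1

for _ in range(200):
    # Inner maximization, in closed form for squared loss:
    #   z* = argmax_z (z - theta)^2 - gamma * (z - z_i)^2 = (gamma*z_i - theta) / (gamma - 1)
    z_adv = (gamma * z - theta) / (gamma - 1.0)
    # Outer minimization: by Danskin's theorem, the gradient of the sup equals
    # the gradient of the loss evaluated at the adversarial maximizer.
    grad = np.mean(2.0 * (theta - z_adv))
    theta -= lr * grad

# Each update "saw" worst-case shifted samples; for this symmetric toy loss the
# robust estimate still converges to the sample mean.
```

In the paper's setting the scalar samples become trajectories, the location model becomes a deep imputer, and the closed-form inner step becomes gradient ascent on the perturbed inputs, but the min-max structure is the same.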