Temporal Wasserstein Imputation: Versatile Missing Data Imputation for Time Series

📅 2024-11-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Non-systematic missingness in time series, particularly under nonlinear dynamics and boundary constraints, compromises downstream statistical analysis. Method: We propose a fully nonparametric imputation framework driven by the Wasserstein distance, introducing a time-aware Wasserstein metric for missing data imputation. Our approach models dynamic dependencies via time-coupled transport costs, accommodates both univariate and multivariate settings, and incorporates prior knowledge (e.g., box constraints) directly into the optimization. We employ an alternating minimization algorithm, with theoretical guarantees on convergence to critical points and conditions under which the marginal distributions can be identified, mitigating the distributional bias typical of existing approaches. Results: Evaluated on simulated linear and nonlinear time series and on real-world groundwater data, our method achieves better imputation accuracy and distributional fidelity than existing approaches, supporting more reliable downstream statistical analysis.

📝 Abstract
Missing data can significantly hamper standard time series analysis, yet in practice they are frequently encountered. In this paper, we introduce temporal Wasserstein imputation, a novel method for imputing missing data in time series. Unlike existing techniques, our approach is fully nonparametric, circumventing the need for model specification prior to imputation, making it suitable for potential nonlinear dynamics. Its principled algorithmic implementation can seamlessly handle univariate or multivariate time series with any non-systematic missing pattern. In addition, the plausible range and side information of the missing entries (such as box constraints) can easily be incorporated. As a key advantage, our method mitigates the distributional bias typical of many existing approaches, ensuring more reliable downstream statistical analysis using the imputed series. Leveraging the benign landscape of the optimization formulation, we establish the convergence of an alternating minimization algorithm to critical points. We also provide conditions under which the marginal distributions of the underlying time series can be identified. Numerical experiments, including extensive simulations covering linear and nonlinear time series models and a real-world groundwater dataset laden with missing values, corroborate the practical usefulness of the proposed method.
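The abstract does not spell out the objective or the update rules, so what follows is only a toy sketch of the general idea, under assumptions that are ours and not taken from the paper. It treats a univariate series, uses lagged pairs (x[t], x[t+1]) to carry the temporal dependence, and compares the two halves of the series through a squared-distance optimal matching, which for equal-sized samples equals the squared 2-Wasserstein distance up to scaling. It alternates between solving the matching and taking a projected gradient step on the missing entries. The names `lag_pairs` and `impute_sketch`, the half-split, the lag of one, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def lag_pairs(x):
    """Return the rows (x[t], x[t+1]), which carry the temporal dependence."""
    return np.column_stack([x[:-1], x[1:]])


def impute_sketch(y, lower, upper, n_iter=30, step=0.1):
    """Toy alternating minimization (hypothetical helper, not the paper's code).

    y: 1-D array with np.nan at the missing positions.
    Returns the imputed series and the objective value at each iteration.
    """
    x = np.asarray(y, dtype=float).copy()
    miss = np.isnan(x)
    idx = np.arange(len(x))
    # Start from linear interpolation, clipped to the box constraints.
    x[miss] = np.clip(np.interp(idx[miss], idx[~miss], x[~miss]), lower, upper)
    half = len(x) // 2
    history = []
    for _ in range(n_iter):
        a, b = lag_pairs(x[:half]), lag_pairs(x[half:])
        # Step 1: with the series fixed, find the optimal matching of halves.
        cost = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
        rows, cols = linear_sum_assignment(cost)
        history.append(cost[rows, cols].sum())
        # Step 2: with the matching fixed, take a projected gradient step
        # on the missing entries only; observed entries never move.
        diff = a[rows] - b[cols]
        grad = np.zeros_like(x)
        np.add.at(grad, rows, 2 * diff[:, 0])
        np.add.at(grad, rows + 1, 2 * diff[:, 1])
        np.add.at(grad, half + cols, -2 * diff[:, 0])
        np.add.at(grad, half + cols + 1, -2 * diff[:, 1])
        x[miss] = np.clip(x[miss] - step * grad[miss], lower, upper)
    return x, history


# Usage on synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
t = np.arange(200)
y = np.sin(0.3 * t) + 0.1 * rng.standard_normal(200)
y[rng.choice(200, size=20, replace=False)] = np.nan
x_hat, history = impute_sketch(y, lower=-1.5, upper=1.5)
```

Clipping after each gradient step is how the box constraint enters this sketch. With the matching held fixed the objective is a convex quadratic in the missing entries, so a small enough step does not increase it, and re-solving the matching cannot increase it either.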
Problem

Research questions and friction points this paper is trying to address.

Handles missing data in time series nonparametrically
Works for univariate or multivariate series with any non-systematic missing pattern
Reduces distributional bias for reliable downstream analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nonparametric temporal Wasserstein imputation method
Handles univariate and multivariate series with any non-systematic missing pattern
Mitigates distributional bias in imputed series
Shuo-Chieh Huang
Department of Statistics, Rutgers University
Tengyuan Liang
Professor, University of Chicago
Ruey S. Tsay
Booth School of Business, University of Chicago