Filling the Missings: Spatiotemporal Data Imputation by Conditional Diffusion

📅 2025-06-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Missing spatiotemporal data, for example from sensor failures, severely hampers applications such as environmental monitoring and intelligent transportation. Existing deep learning imputation methods struggle to capture strong spatiotemporal couplings and suffer from error accumulation in iterative estimation. To address these limitations, the authors propose CoFILL, a conditional diffusion model that abandons iterative refinement and directly generates imputations from noise that conform to the underlying data distribution. CoFILL introduces a novel time-frequency dual-stream neural network to jointly model temporal dynamics and spectral periodicity, and integrates spatiotemporal graph structures to encode both local dependencies and global topology. Crucially, it requires no predefined prior distribution and enables end-to-end robust imputation. Extensive experiments on multiple real-world spatiotemporal datasets demonstrate that CoFILL significantly outperforms state-of-the-art methods, reducing MAE by 12.7%–23.4%. The code is publicly available.

📝 Abstract

Missing data in spatiotemporal systems presents a significant challenge for modern applications, ranging from environmental monitoring to urban traffic management. The integrity of spatiotemporal data often deteriorates due to hardware malfunctions and software failures in real-world deployments. Current approaches based on machine learning and deep learning struggle to model the intricate interdependencies between spatial and temporal dimensions effectively and, more importantly, suffer from cumulative errors during the data imputation process, which propagate and amplify through iterations. To address these limitations, we propose CoFILL, a novel Conditional Diffusion Model for spatiotemporal data imputation. CoFILL builds on the inherent advantages of diffusion models to generate high-quality imputations without relying on potentially error-prone prior estimates. It incorporates an innovative dual-stream architecture that processes temporal and frequency domain features in parallel. By fusing these complementary features, CoFILL captures both rapid fluctuations and underlying patterns in the data, which enables more robust imputation. The extensive experiments reveal that CoFILL's noise prediction network successfully transforms random noise into meaningful values that align with the true data distribution. The results also show that CoFILL outperforms state-of-the-art methods in imputation accuracy. The source code is publicly available at https://github.com/joyHJL/CoFILL.
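As a rough illustration of how a conditional diffusion model can be trained for imputation, the sketch below builds one training example: some observed readings are held out as targets, noised with the standard forward diffusion step, and a noise-prediction loss is scored only on those held-out positions. This is a generic sketch, not CoFILL's implementation; the linear noise schedule, the masking scheme, and every function name here are assumptions made for illustration.

```python
import numpy as np

def make_alpha_bar(num_steps=50, beta_start=1e-4, beta_end=0.2):
    """Cumulative signal-retention factors for an (assumed) linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def build_training_pair(x, observed_mask, target_mask, t, alpha_bar, rng):
    """Split data into a clean conditioning part and a noised target part."""
    # Conditioning input: observed values that are NOT held out as targets.
    cond_mask = observed_mask & ~target_mask
    cond = np.where(cond_mask, x, 0.0)
    # Forward diffusion at step t: x_t = sqrt(a_bar) * x_0 + sqrt(1 - a_bar) * eps.
    eps = rng.standard_normal(x.shape)
    noisy = np.sqrt(alpha_bar[t]) * x + np.sqrt(1.0 - alpha_bar[t]) * eps
    # Only the held-out positions are shown to the network in noised form.
    noisy_target = np.where(target_mask, noisy, 0.0)
    return cond, noisy_target, eps

def masked_noise_loss(eps_pred, eps, target_mask):
    """Mean squared error between predicted and true noise on target positions."""
    sq_err = (eps_pred - eps) ** 2
    return float(sq_err[target_mask].mean())

# Toy data: 4 sensors, 12 time steps (hypothetical values).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 12))

observed_mask = np.ones(x.shape, dtype=bool)
observed_mask[1, 3:6] = False            # simulated sensor outage

target_mask = np.zeros(x.shape, dtype=bool)
target_mask[0, 2:5] = True               # observed readings hidden for training
target_mask[2, 7:10] = True

alpha_bar = make_alpha_bar()
cond, noisy_target, eps = build_training_pair(
    x, observed_mask, target_mask, 10, alpha_bar, rng
)
```

In a real model, a noise prediction network would receive `cond`, `noisy_target`, and the step index, and `masked_noise_loss` would compare its output against `eps`. At inference time the truly missing entries would start from pure noise and be denoised step by step while conditioning on the observed values.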

Problem

Research questions and friction points this paper is trying to address.

Imputing missing spatiotemporal data accurately

Reducing cumulative errors in data imputation

Modeling spatial-temporal dependencies effectively

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conditional Diffusion Model for imputation

Dual-stream architecture processes temporal and frequency-domain features in parallel

Noise prediction network transforms random noise into values that align with the true data distribution
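The dual-stream idea from the abstract can be illustrated with a toy feature extractor: one stream keeps step-to-step changes in the time domain (rapid fluctuations), the other takes a spectrum of the same window (periodic patterns), and the two are fused by concatenation. This is only a sketch under assumptions: first differences, the real FFT, and plain concatenation are stand-ins chosen for brevity, and the function names are hypothetical. The paper's actual network layers and fusion mechanism are not described in this summary.

```python
import numpy as np

def temporal_features(window):
    """Time-domain stream: first differences, which highlight rapid fluctuations."""
    return np.diff(window, axis=-1, prepend=window[..., :1])

def frequency_features(window):
    """Frequency-domain stream: magnitude spectrum, which highlights periodicity."""
    return np.abs(np.fft.rfft(window, axis=-1))

def fuse_streams(window):
    """Concatenate both streams along the feature axis (assumed fusion rule)."""
    return np.concatenate(
        [temporal_features(window), frequency_features(window)], axis=-1
    )

# Toy input: one sensor, 32 time steps, a sine wave with a period of 8 steps.
t = np.arange(32)
window = np.sin(2 * np.pi * t / 8)[None, :]

temp = temporal_features(window)      # shape (1, 32)
freq = frequency_features(window)     # shape (1, 17), i.e. 32 // 2 + 1 bins
fused = fuse_streams(window)          # shape (1, 49)
dominant_bin = int(np.argmax(freq[0]))
```

For this window, the dominant frequency bin is 4, since a period of 8 steps fits four full cycles into 32 samples. A periodic pattern that is hard to see in the raw differences shows up as a single sharp peak in the spectrum, which is the kind of complementary information a second stream can contribute.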

🔎 Similar Papers

A Survey on Diffusion Models for Time Series and Spatio-Temporal Data