🤖 AI Summary
This work addresses the challenge of missing smart meter data, often caused by privacy restrictions, device failures, or low sampling resolution, by proposing a unified generative framework based on conditional flow matching. Unlike existing approaches that require a separate model for each task, such as synthetic data generation, missing data imputation, or super-resolution, this study introduces, for the first time, flow matching to model high-dimensional, 15-minute-resolution monthly electricity consumption time series. The framework treats all of these tasks as instances of conditional generation, enabling a single model to flexibly incorporate partial observations as conditions. It produces highly realistic time series that remain consistent with the known measurements, outperforming interpolation and task-specific baselines in both modeling efficiency and result coherence.
📝 Abstract
Smart meter data are the foundation for planning and operating the distribution network. Unfortunately, such data are not always available due to privacy regulations. Meanwhile, the collected data may be corrupted by sensor or transmission failures, or may lack sufficient resolution for downstream tasks. A wide range of generative tasks has been formulated to address these issues, including synthetic data generation, missing data imputation, and super-resolution. Despite the success of machine learning models on these tasks, a dedicated model must be designed and trained for each one, leading to redundancy and inefficiency. In this paper, recognizing the powerful modeling capability of flow matching models, we propose a new approach that unifies diverse smart meter data generative tasks with a single model trained for conditional generation. The proposed flow matching models are trained to generate challenging, high-dimensional time series data, specifically monthly smart meter data at 15-minute resolution. By viewing different generative tasks as distinct forms of partial data observation and injecting those observations into the generation process, we unify tasks such as imputation and super-resolution under a single model, eliminating the need for re-training. The data generated by our model are not only consistent with the given observations but also realistic, outperforming interpolation and task-specific machine learning baselines.
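To make the idea concrete, here is a minimal NumPy sketch of the conditional flow matching training target described in the abstract, using the common linear (optimal-transport) probability path. The dimensions, the hourly observation mask, and the `fm_loss` helper are illustrative assumptions, not details from the paper; a real model would regress the velocity target with a neural network conditioned on the partial observations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not the paper's): a month of 15-minute
# readings would be 96 * 30 = 2880 points; we use one day (96) here.
T = 96
x1 = rng.normal(size=T)   # a real load profile (data sample)
x0 = rng.normal(size=T)   # a sample from the noise base distribution

# Partial observation acting as the condition: a binary mask over
# known readings. Imputation keeps the observed points; super-resolution
# keeps a coarse subset (here, one reading per hour).
mask = np.zeros(T)
mask[::4] = 1.0
condition = np.concatenate([mask * x1, mask])  # masked values + mask

# Flow matching with the linear path: draw a time t, interpolate
# between noise and data, and regress the constant velocity x1 - x0.
t = rng.uniform()
x_t = (1.0 - t) * x0 + t * x1   # point on the probability path
v_target = x1 - x0              # velocity the model should predict

def fm_loss(v_pred):
    """MSE between the model's velocity prediction and the target."""
    return np.mean((v_pred - v_target) ** 2)
```

In this view, imputation and super-resolution differ only in which entries of `mask` are set, so the same trained model serves both tasks by swapping the condition.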