AI Summary
Addressing the dual challenges of data scarcity and privacy preservation in long-term electricity consumption forecasting for individual consumers, this paper proposes a solution based on high-fidelity synthetic time-series data generation. Unlike mainstream approaches focused on short-term, system-level forecasting, it systematically evaluates and enhances four generative models: a hybrid Wasserstein GAN (WGAN), a denoising diffusion probabilistic model (DDPM), a hidden Markov model (HMM), and masked autoregressive Bernstein polynomial normalizing flows (MABF). The comparison targets how well each model captures temporal dynamics, long-range dependencies, and probabilistic state transitions while ensuring full anonymization. Experiments on an open-source dataset of German household electricity consumption with 15-minute resolution demonstrate that the synthetic data faithfully reproduce real consumption patterns and achieve performance on par with real data in both state estimation and long-term load forecasting tasks. This work establishes a scalable, verifiable paradigm for individual-level energy modeling in privacy-sensitive settings.
Abstract
Forecasting attracts a lot of research attention in the electricity value chain. However, most studies concentrate on short-term forecasting of generation or consumption with a focus on systems and less on individual consumers. Even more neglected is the topic of long-term forecasting of individual power consumption.
Here, we provide an in-depth comparative evaluation of data-driven methods for generating synthetic time series data tailored to long-term forecasting of energy consumption. High-fidelity synthetic data is crucial for a wide range of applications, including state estimation in energy systems and power grid planning. In this study, we assess and compare the performance of multiple state-of-the-art but less common techniques: a hybrid Wasserstein Generative Adversarial Network (WGAN), a Denoising Diffusion Probabilistic Model (DDPM), a Hidden Markov Model (HMM), and Masked Autoregressive Bernstein polynomial normalizing Flows (MABF). We analyze the ability of each method to replicate the temporal dynamics, long-range dependencies, and probabilistic transitions characteristic of individual energy consumption profiles. Our comparative evaluation highlights the strengths and limitations of WGAN, DDPM, HMM, and MABF, which helps in selecting the most suitable approach for state estimation and other energy-related tasks. Our generation and analysis framework aims to enhance the accuracy and reliability of synthetic power consumption data while ensuring that the generated data fulfills criteria such as anonymisation, which preserves privacy and mitigates the risk of profiling individual customers. This study uses an open-source dataset from households in Germany with a time resolution of 15 minutes. The generated synthetic power profiles can readily be used in applications such as state estimation or consumption forecasting.
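To make the idea of probabilistic state transitions concrete, the following minimal sketch samples one synthetic daily load profile at 15-minute resolution from a hidden Markov model with Gaussian emissions. It is an illustration only and not the configuration used in the study: the function name `sample_hmm_profile`, the three states, and all transition, mean, and standard deviation values are hypothetical and were not fitted to any dataset.

```python
import numpy as np

STEPS_PER_DAY = 24 * 60 // 15  # 96 samples per day at 15-minute resolution


def sample_hmm_profile(n_steps, start_prob, trans, means, stds, seed=0):
    """Sample hidden states and a load profile (kW) from a Gaussian-emission HMM."""
    rng = np.random.default_rng(seed)
    n_states = len(start_prob)
    states = np.empty(n_steps, dtype=int)
    states[0] = rng.choice(n_states, p=start_prob)
    for t in range(1, n_steps):
        # The next hidden state depends only on the current one (Markov property).
        states[t] = rng.choice(n_states, p=trans[states[t - 1]])
    # One Gaussian draw per step; clip at zero because consumption is non-negative.
    load = rng.normal(means[states], stds[states])
    return states, np.clip(load, 0.0, None)


# Hypothetical parameters (assumed for illustration): idle, active, peak.
start_prob = np.array([0.80, 0.15, 0.05])
trans = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.80, 0.10],
    [0.05, 0.25, 0.70],
])
means = np.array([0.15, 0.60, 2.00])  # kW
stds = np.array([0.03, 0.15, 0.40])   # kW

states, load = sample_hmm_profile(STEPS_PER_DAY, start_prob, trans, means, stds)
print(load.shape)
```

In practice the parameters would be estimated from measured household profiles rather than set by hand, and a single time-homogeneous transition matrix like this one cannot represent time-of-day effects, which is one reason to compare an HMM against deep generative models.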