🤖 AI Summary
Synthetic power flow data must be both statistically representative and physically feasible. To meet both requirements, this paper proposes a diffusion model that incorporates physics-based constraints, in particular Kirchhoff's laws. Methodologically, it introduces, for the first time, a differentiable gradient-guidance mechanism into the DDPM sampling process that steers samples toward constraint satisfaction, combined with conditional label embedding and feature engineering specific to tabular data. The key contribution is overcoming the limitations of GANs and VAEs in modeling structured physical constraints, which couples the generative process with domain knowledge end to end. Experiments on IEEE benchmark systems show that more than 99.2% of the generated power flow samples are feasible (i.e., they satisfy the power flow equations and operational limits), and that the Wasserstein distance is 37% lower than with state-of-the-art methods, improving both statistical fidelity and engineering utility.
📝 Abstract
Growing concerns over privacy, security, and legal barriers are driving demand for synthetic data across domains such as healthcare, finance, and energy. While generative models offer a promising way around these barriers, their utility depends on the incorporation of domain-specific knowledge. We propose to synthesize data using a guided diffusion model that integrates domain constraints directly into the generative process. We develop the model in the context of power systems, with potential applicability to other domains that involve tabular data. Specifically, we synthesize statistically representative, high-fidelity power flow datasets. To satisfy domain constraints, e.g., Kirchhoff's laws, we introduce gradient-based guidance that steers the sampling trajectory in a feasible direction. Numerical results demonstrate the effectiveness of our approach.
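As a rough illustration of gradient-based guidance in a DDPM-style sampler, the sketch below nudges each reverse step down the gradient of a constraint penalty. It is a minimal sketch under stated assumptions, not the paper's implementation: the linear constraint `A x = b` (e.g., nodal injections summing to zero) stands in for the nonlinear power flow equations, `placeholder_eps_model` is a hypothetical stand-in for a trained noise predictor, and the noise schedule, the `guidance_scale` value, and the choice to apply guidance to the posterior mean are all illustrative.

```python
import numpy as np


def constraint_residual(x, A, b):
    """Residual A x - b of a linear constraint, one row per sample."""
    return x @ A.T - b


def guidance_gradient(x, A, b):
    """Gradient of the penalty 0.5 * ||A x - b||^2 with respect to x."""
    return constraint_residual(x, A, b) @ A


def placeholder_eps_model(x, t):
    """Hypothetical stand-in for a trained noise predictor (always zero)."""
    return np.zeros_like(x)


def guided_ddpm_sample(n_samples, A, b, n_steps=50, guidance_scale=0.2,
                       seed=0, eps_model=placeholder_eps_model):
    """Run a DDPM-style reverse process with a constraint-gradient nudge."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, n_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal((n_samples, A.shape[1]))  # start from pure noise
    for t in range(n_steps - 1, -1, -1):
        eps = eps_model(x, t)
        # Standard DDPM posterior mean computed from the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # Guidance: move the mean down the gradient of the constraint penalty.
        mean = mean - guidance_scale * guidance_gradient(mean, A, b)
        # No noise is added at the final step (t == 0).
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    A = np.ones((1, 4))  # toy balance constraint: four injections sum to zero
    b = np.zeros(1)
    guided = guided_ddpm_sample(256, A, b, guidance_scale=0.2)
    unguided = guided_ddpm_sample(256, A, b, guidance_scale=0.0)
    print("mean |residual| guided:  ", np.abs(constraint_residual(guided, A, b)).mean())
    print("mean |residual| unguided:", np.abs(constraint_residual(unguided, A, b)).mean())
```

For this toy constraint the gradient step only contracts the residual when `guidance_scale` is below 2 divided by the squared norm of `A` (here 0.5), so the scale would need retuning for other constraints. With a nonlinear constraint such as the AC power flow equations, the gradient would typically come from automatic differentiation rather than the closed form used here.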