Domain-Constrained Diffusion Models to Synthesize Tabular Data: A Case Study in Power Systems

📅 2025-06-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the challenge of simultaneously ensuring statistical representativeness and physical feasibility in synthetic power flow data generation, this paper proposes a diffusion model incorporating physics-based constraints—particularly Kirchhoff’s laws. Methodologically, it introduces, for the first time, a differentiable gradient-guidance mechanism into the DDPM sampling process to enforce physical constraints implicitly and hard-codedly, integrated with conditional label embedding and tabular-data-specific feature engineering. The key contribution lies in overcoming the limitations of GANs and VAEs in modeling structured physical constraints, enabling end-to-end coupling between generative processes and domain knowledge. Experiments on IEEE benchmark systems demonstrate that the generated flow data achieve over 99.2% feasibility (i.e., satisfy power flow equations and operational limits), while reducing the Wasserstein distance by 37% compared to state-of-the-art methods—significantly improving both statistical fidelity and engineering utility.

Technology Category

Application Category

📝 Abstract
Growing concerns over privacy, security, and legal barriers are driving the rising demand for synthetic data across domains such as healthcare, finance, and energy. While generative models offer a promising solution to overcome these barriers, their utility depends on the incorporation of domain-specific knowledge. We propose to synthesize data using a guided diffusion model that integrates domain constraints directly into the generative process. We develop the model in the context of power systems, with potential applicability to other domains that involve tabular data. Specifically, we synthesize statistically representative and high-fidelity power flow datasets. To satisfy domain constraints, e.g., Kirchhoff laws, we introduce a gradient-based guidance to steer the sampling trajectory in a feasible direction. Numerical results demonstrate the effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Synthesize tabular data with domain constraints
Ensure synthetic data satisfies physical laws
Generate high-fidelity power flow datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-constrained diffusion models for tabular data
Gradient-based guidance enforces Kirchhoff laws
Synthesizes high-fidelity power flow datasets
🔎 Similar Papers
No similar papers found.
M
Milad Hoseinpour
Department of Electrical and Computer Engineering, University of Michigan, Ann Arbor, USA
Vladimir Dvorkin
Vladimir Dvorkin
University of Michigan
power systemsrenewable energyelectricity marketsoperations researchprivacy