Do Graph Diffusion Models Accurately Capture and Generate Substructure Distributions?

📅 2025-02-04

📈 Citations: 0

✨ Influential: 0

career value

181K/year

🤖 AI Summary

Existing graph diffusion models struggle to accurately capture the frequency distributions of substructures (e.g., triangles, 4-cycles) present in training data, resulting in low structural fidelity of generated graphs. To address this, we establish, for the first time, a theoretical connection between the expressive power of Graph Neural Networks (GNNs) and the performance of graph diffusion generative modeling, proposing substructure distribution as a key evaluation metric—and proving that higher-GNN expressivity significantly improves distributional consistency. Methodologically, we integrate highly expressive GNNs (e.g., 3-WL-equivalent models) into the diffusion backbone and couple them with a substructure-statistics-based evaluation and interpretability framework. On multiple benchmark datasets, our approach reduces substructure count distribution error by over 40%, while markedly improving both structural diversity and alignment with ground-truth distributions. This work provides a novel theoretical perspective and a practical paradigm for structure-aware graph generation.

Technology Category

Application Category

📝 Abstract

Diffusion models have gained popularity in graph generation tasks; however, the extent of their expressivity concerning the graph distributions they can learn is not fully understood. Unlike models in other domains, popular backbones for graph diffusion models, such as Graph Transformers, do not possess universal expressivity to accurately model the distribution scores of complex graph data. Our work addresses this limitation by focusing on the frequency of specific substructures as a key characteristic of target graph distributions. When evaluating existing models using this metric, we find that they fail to maintain the distribution of substructure counts observed in the training set when generating new graphs. To address this issue, we establish a theoretical connection between the expressivity of Graph Neural Networks (GNNs) and the overall performance of graph diffusion models, demonstrating that more expressive GNN backbones can better capture complex distribution patterns. By integrating advanced GNNs into the backbone architecture, we achieve significant improvements in substructure generation.

Problem

Research questions and friction points this paper is trying to address.

Graph diffusion models' expressivity limitations

Inaccurate substructure distribution in generated graphs

Enhancing GNNs for better graph generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Diffusion Models

Substructure Frequency Focus

Advanced GNN Integration

🔎 Similar Papers

Random Walk Diffusion for Efficient Large-Scale Graph Generation