DeFoG: Discrete Flow Matching for Graph Generation

📅 2024-10-05
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
Graph diffusion models suffer from inefficient sampling and a tight coupling between training and sampling. This paper proposes DeFoG, the first discrete flow matching framework tailored to graph generation, which decouples the training objective from the sampling process to enable efficient and flexible modeling of graph distributions. Methodologically, the authors establish theoretical consistency between the training loss and the sampling algorithm; design a symmetry-aware graph neural network with a progressive reconstruction mechanism to preserve structural invariance; and propose few-step sampling strategies that substantially expand the sampler design space. DeFoG achieves state-of-the-art performance on synthetic graphs, molecular structures, and digital pathology datasets, surpassing most mainstream diffusion models with only 5–10% of their sampling steps.
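To make the discrete flow-matching idea concrete, here is a minimal, illustrative sketch of one Euler sampling step over categorical states (e.g. node or edge types), assuming a linear interpolation path between a noise prior and the data; this is a generic discrete flow-matching step, not DeFoG's actual implementation, and the function name and shapes are hypothetical:

```python
import numpy as np

def euler_step(x_t, p1_probs, t, dt, rng):
    """One discrete flow-matching Euler step (illustrative sketch).

    x_t:      (n,) current categorical states (node or edge types)
    p1_probs: (n, K) model's predicted clean-data distribution p(x1 | x_t)
    t, dt:    current time in [0, 1) and step size
    """
    n, K = p1_probs.shape
    onehot = np.eye(K)[x_t]                     # current state as one-hot
    # Conditional velocity toward the predicted clean distribution,
    # for a linear path: u_t = (p(x1 | x_t) - delta_{x_t}) / (1 - t)
    u = (p1_probs - onehot) / max(1.0 - t, 1e-8)
    probs = onehot + dt * u                     # Euler update of the marginal
    probs = np.clip(probs, 0.0, None)
    probs /= probs.sum(axis=1, keepdims=True)
    # Resample each variable from the updated categorical distribution
    return np.array([rng.choice(K, p=probs[i]) for i in range(n)])
```

Because the model only has to predict `p(x1 | x_t)`, the step size schedule and number of steps are free sampling-time choices — the decoupling the paper exploits for few-step generation.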

📝 Abstract
Graph generative models are essential across diverse scientific domains by capturing complex distributions over relational data. Among them, graph diffusion models achieve superior performance but face inefficient sampling and limited flexibility due to the tight coupling between training and sampling stages. We introduce DeFoG, a novel graph generative framework that disentangles sampling from training, enabling a broader design space for more effective and efficient model optimization. DeFoG employs a discrete flow-matching formulation that respects the inherent symmetries of graphs. We theoretically ground this disentangled formulation by explicitly relating the training loss to the sampling algorithm and showing that DeFoG faithfully replicates the ground truth graph distribution. Building on these foundations, we thoroughly investigate DeFoG's design space and propose novel sampling methods that significantly enhance performance and reduce the required number of refinement steps. Extensive experiments demonstrate state-of-the-art performance across synthetic, molecular, and digital pathology datasets, covering both unconditional and conditional generation settings. It also outperforms most diffusion-based models with just 5-10% of their sampling steps.
Problem

Research questions and friction points this paper is trying to address.

Tight coupling between training and sampling stages limits flexibility in graph generative models.
Graph diffusion models require many refinement steps, making sampling inefficient.
How can the number of sampling steps be reduced without sacrificing generation quality?
Innovation

Methods, ideas, or system contributions that make the work stand out.

First discrete flow-matching formulation for graph generation, respecting graph symmetries
Decouples the training objective from the sampling procedure, widening the design space
Novel sampling methods that cut the required refinement steps to 5–10% of diffusion baselines
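The decoupling in the second bullet can be sketched from the training side as well: training only requires corrupting clean data along an interpolation path and supervising the model to recover it, with no reference to any particular sampler. The sketch below builds one such training pair under an assumed linear path with a uniform noise prior; function and variable names are hypothetical:

```python
import numpy as np

def fm_training_example(x1, prior_probs, rng):
    """Build one discrete flow-matching training pair (illustrative sketch).

    x1:          (n,) clean categorical states
    prior_probs: (K,) noise prior (e.g. uniform over categories)
    Returns (t, x_t, target): the model is trained by cross-entropy
    to predict the clean target x1 from the noisy pair (x_t, t).
    """
    n = x1.shape[0]
    K = prior_probs.shape[0]
    t = rng.uniform()                           # random interpolation time
    # Linear path: each variable keeps its clean value with probability t,
    # otherwise it is resampled from the noise prior
    keep = rng.uniform(size=n) < t
    noise = rng.choice(K, size=n, p=prior_probs)
    x_t = np.where(keep, x1, noise)
    return t, x_t, x1
```

Since this objective never fixes a step schedule, any sampler consistent with the path (including few-step variants) can be swapped in after training.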