Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of modeling distributions that are invariant under group symmetries, such as permutations and rotations, in tasks like molecular graph generation. The authors propose a quotient-space-based generative framework that maps each sample to an orbit representative, so that an unconstrained diffusion or flow model can be trained on a canonical slice. Invariance is restored during generation by applying a random symmetry transformation. This approach avoids architectural constraints, and the authors prove its correctness, universality, and superior expressivity on invariant targets. Canonicalization also accelerates training by removing the score complexity induced by group mixtures and by reducing conditional variance in flow matching. With geometric spectra-based canonicalization and mild positional encodings, canonical diffusion significantly outperforms equivariant baselines in 3D molecule generation at similar or lower computational cost. With the Canon architecture, CanonFlow achieves state-of-the-art performance on GEOM-DRUG, and the advantage remains large in few-step generation.

📝 Abstract
Many generative tasks in chemistry and science involve distributions invariant to group symmetries (e.g., permutation and rotation). A common strategy enforces invariance and equivariance through architectural constraints such as equivariant denoisers and invariant priors. In this paper, we challenge this tradition through the alternative canonicalization perspective: first map each sample to an orbit representative with a canonical pose or order, train an unconstrained (non-equivariant) diffusion or flow model on the canonical slice, and finally recover the invariant distribution by sampling a random symmetry transform at generation time. Building on a formal quotient-space perspective, our work provides a comprehensive theory of canonical diffusion by proving that: (i) canonical generative models are correct, universal, and superior in expressivity over invariant targets; (ii) canonicalization accelerates training by removing diffusion score complexity induced by group mixtures and reducing conditional variance in flow matching. We then show that aligned priors and optimal transport act complementarily with canonicalization and further improve training efficiency. We instantiate the framework for molecular graph generation under $S_n \times SE(3)$ symmetries. By leveraging geometric spectra-based canonicalization and mild positional encodings, canonical diffusion significantly outperforms equivariant baselines in 3D molecule generation tasks, with similar or even less computation. Moreover, with a novel architecture, Canon, CanonFlow achieves state-of-the-art performance on the challenging GEOM-DRUG dataset, and the advantage remains large in few-step generation.
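The three-step recipe in the abstract (canonicalize, train without constraints, re-randomize at generation time) can be illustrated with a minimal NumPy sketch on a bare 3D point cloud. This is not the paper's method: the paper uses a geometric spectra-based canonicalization, whereas the sketch below assumes a simpler PCA-plus-sorting rule, and the names `canonicalize` and `random_symmetry` are hypothetical. It also assumes generic inputs (distinct principal variances, nonzero skewness along each axis, no ties in the sort key) and handles translation only by centering.

```python
import numpy as np


def canonicalize(x):
    """Map an (n, 3) point cloud to one representative of its orbit.

    Assumed rule (illustrative, not the paper's): center, align to principal
    axes with a proper rotation, then order points along the last axis.
    """
    x = x - x.mean(axis=0)                     # remove translation
    _, vecs = np.linalg.eigh(x.T @ x)          # principal axes, ascending variance
    y = x @ vecs
    # Resolve the per-axis sign ambiguity with the third moment along each axis.
    signs = np.sign((y ** 3).sum(axis=0))
    signs[signs == 0] = 1.0
    y = y * signs
    # Keep the overall transform a rotation (no reflection), so chirality
    # is preserved; flip the lowest-variance axis back if needed.
    if np.linalg.det(vecs * signs) < 0:
        y[:, 0] *= -1.0
    # Fix the point order by sorting along the highest-variance axis.
    return y[np.argsort(y[:, 2])]


def random_symmetry(x, rng):
    """Apply a random permutation and a random rotation to an (n, 3) cloud."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))                # standard QR sign correction
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0                        # force det = +1 (proper rotation)
    return x[rng.permutation(len(x))] @ q.T


rng = np.random.default_rng(0)
cloud = rng.normal(size=(6, 3))
canonical = canonicalize(cloud)                # training would see only these
sample = random_symmetry(canonical, rng)       # generation-time re-randomization
```

In this picture, an unconstrained model would be fit to outputs of `canonicalize`, and each generated canonical sample would be passed through `random_symmetry` to recover an invariant distribution over poses and orderings.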
Problem

Research questions and friction points this paper is trying to address.

symmetry
molecular graph generation
invariant distribution
group actions
3D molecule generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

canonicalization
diffusion models
symmetry
molecular graph generation
quotient space