FragFM: Efficient Fragment-Based Molecular Generation via Discrete Flow Matching

📅 2025-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address low chemical validity, limited structural diversity, long sampling trajectories, and high computational cost in molecular graph generation, this paper introduces the first fragment-based discrete flow matching framework. The method uses a hierarchical autoencoder that maps coarse-grained (fragment-level) representations to fine-grained (atom-level) ones, encodes molecules as invertible hierarchical graphs, and performs discrete flow matching directly in fragment space to jointly model chemical constraints and structural priors. Compared with continuous-flow or autoregressive approaches, the framework reduces the number of sampling steps by 10–50× while achieving over 99% validity. Evaluated on ZINC, MOSES, and natural product benchmarks, the model maintains high structural diversity and precise property control, demonstrating gains in efficiency, scalability, and chemical plausibility.
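
The coarse-to-fine idea can be illustrated with a toy data structure. The sketch below is an assumption made for illustration only: the `Fragment` class, the two-entry `VOCAB`, and the hand-written `expand` function are hypothetical, and `expand` is a deterministic stand-in for the learned decoder described in the paper. It shows how a fragment-level graph plus junction information determines an atom-level graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    atoms: tuple  # element symbols, indexed locally within the fragment
    bonds: tuple  # (i, j) pairs of local atom indices


# Hypothetical fragment vocabulary (not taken from the paper).
VOCAB = {
    "ethyl": Fragment(atoms=("C", "C"), bonds=((0, 1),)),
    "hydroxyl": Fragment(atoms=("O",), bonds=()),
}


def expand(fragment_names, junctions):
    """Expand a fragment-level graph into an atom-level graph.

    junctions: list of (frag_a, atom_a, frag_b, atom_b), meaning local atom
    atom_a of fragment frag_a is bonded to local atom atom_b of frag_b.
    """
    atoms, bonds, offsets = [], [], []
    for name in fragment_names:
        frag = VOCAB[name]
        base = len(atoms)  # global index of this fragment's first atom
        offsets.append(base)
        atoms.extend(frag.atoms)
        bonds.extend((base + i, base + j) for i, j in frag.bonds)
    # Add the inter-fragment bonds that the coarse graph leaves implicit.
    for fa, ia, fb, ib in junctions:
        bonds.append((offsets[fa] + ia, offsets[fb] + ib))
    return atoms, bonds


# Coarse graph: ethyl bonded to hydroxyl; fine graph: the C-C-O chain of ethanol.
atoms, bonds = expand(["ethyl", "hydroxyl"], [(0, 1, 1, 0)])
print(atoms, bonds)
```

In this toy setting the generative model only has to choose two fragment nodes and one junction rather than three atoms and two bonds, which is the kind of reduction in graph size that fragment-level generation relies on.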

📝 Abstract
We introduce FragFM, a novel fragment-based discrete flow matching framework for molecular graph generation. FragFM generates molecules at the fragment level, leveraging a coarse-to-fine autoencoding mechanism to reconstruct atom-level details. This approach reduces computational complexity while maintaining high chemical validity, enabling more efficient and scalable molecular generation. We benchmark FragFM against state-of-the-art diffusion- and flow-based models on standard molecular generation benchmarks and natural product datasets, demonstrating superior performance in validity, property control, and sampling efficiency. Notably, FragFM achieves over 99% validity with significantly fewer sampling steps, improving scalability while preserving molecular diversity. These results highlight the potential of fragment-based generative modeling for large-scale, property-aware molecular design, paving the way for more efficient exploration of chemical space.
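
To make the role of the sampling-step count concrete, here is a minimal sketch of Euler-style sampling for discrete flow matching over fragment tokens. It assumes a linear mixture path from a uniform source distribution, under which each position jumps to a draw from the predicted clean distribution with probability h / (1 - t) per step of size h. This is one common sampler for such paths, not necessarily the scheme FragFM uses. The vocabulary, `toy_denoiser`, and `sample` are hypothetical names, and the denoiser is a fixed stand-in for a trained network.

```python
import random

# Hypothetical fragment vocabulary, for illustration only.
FRAGMENT_VOCAB = ["benzene", "amide", "methyl", "hydroxyl"]
TARGET = ["benzene", "amide", "methyl"]


def toy_denoiser(state, t):
    """Stand-in for a trained network that predicts p(x_1 | x_t) per position.

    It ignores its inputs and puts all probability mass on a fixed target.
    """
    return [{f: 1.0 if f == tgt else 0.0 for f in FRAGMENT_VOCAB} for tgt in TARGET]


def sample(num_positions, num_steps, rng):
    # Start from noise: every position is a uniformly random fragment.
    state = [rng.choice(FRAGMENT_VOCAB) for _ in range(num_positions)]
    for i in range(num_steps):
        t = i / num_steps
        probs = toy_denoiser(state, t)
        # Jump probability h / (1 - t) with h = 1 / num_steps, written as
        # 1 / (num_steps - i) so that it is exactly 1.0 on the final step.
        jump_prob = 1.0 / (num_steps - i)
        for pos in range(num_positions):
            if rng.random() < jump_prob:
                weights = [probs[pos][f] for f in FRAGMENT_VOCAB]
                state[pos] = rng.choices(FRAGMENT_VOCAB, weights=weights)[0]
    return state


result = sample(num_positions=3, num_steps=10, rng=random.Random(0))
print(result)
```

Each step costs one denoiser call over the current graph, so sampling cost scales with both the number of steps and the number of nodes; a fragment-level state has fewer nodes than the corresponding atom-level graph.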
Problem

Research questions and friction points this paper is trying to address.

Fragment-based molecular generation
Reduces computational complexity
Improves scalability and validity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fragment-based discrete flow matching
Coarse-to-fine autoencoding mechanism
High chemical validity with fewer steps