SemlaFlow -- Efficient 3D Molecular Generation with Latent Attention and Equivariant Flow Matching

📅 2024-06-11
📈 Citations: 4
Influential: 1
🤖 AI Summary
Existing 3D molecular generation methods suffer from slow sampling, low chemical validity, or geometric distortions. This paper introduces the first unconditional E(3)-equivariant flow matching framework for joint generation of molecular graphs (including atom types, bond types, and formal charges) and 3D conformations. Key contributions include: (1) Semla, a scalable E(3)-equivariant message-passing architecture; (2) latent-space attention to enhance structural-geometric co-modeling; and (3) new benchmark metrics that expose biases in prevailing evaluation protocols. On standard datasets such as GEOM-QM9, the method reports state-of-the-art performance with only 20 sampling steps, a 100× speedup over the best prior method, while attaining >99.5% chemical validity, significantly reduced bond-angle and dihedral-angle errors, and improved conformational diversity and FCD scores.

📝 Abstract
Methods for jointly generating molecular graphs along with their 3D conformations have gained prominence recently due to their potential impact on structure-based drug design. Current approaches, however, often suffer from very slow sampling times or generate molecules with poor chemical validity. Addressing these limitations, we propose Semla, a scalable E(3)-equivariant message passing architecture. We further introduce an unconditional 3D molecular generation model, SemlaFlow, which is trained using equivariant flow matching to generate a joint distribution over atom types, coordinates, bond types and formal charges. Our model produces state-of-the-art results on benchmark datasets with as few as 20 sampling steps, corresponding to a two-order-of-magnitude speedup over the previous state of the art. Furthermore, we highlight limitations of current evaluation methods for 3D generation and propose new benchmark metrics for unconditional molecular generators. Finally, using these new metrics, we compare our model's ability to generate high-quality samples against current approaches and further demonstrate SemlaFlow's strong performance.
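
To make the "20 sampling steps" claim concrete, here is a minimal sketch of fixed-step Euler sampling along a straight-line flow-matching path, for atom coordinates only. It is an illustration under stated assumptions, not SemlaFlow's sampler: `predict_clean`, `euler_sample`, and `TARGET` are hypothetical names, the trained network is replaced by an oracle that always returns a fixed toy geometry, and the categorical parts of the model (atom types, bond types, formal charges) are left out.

```python
import random

# Toy target geometry (three atoms, arbitrary coordinates); purely illustrative.
TARGET = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]


def predict_clean(x_t, t):
    """Hypothetical stand-in for a trained network that predicts clean coordinates.

    A real model would depend on x_t and t; this oracle always returns TARGET.
    """
    return TARGET


def euler_sample(x0, predict, num_steps=20):
    """Integrate dx/dt = (x1_hat - x_t) / (1 - t) from t=0 to t=1 with Euler steps."""
    x = [list(p) for p in x0]
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays below 1, so (1 - t) is never zero
        x1_hat = predict(x, t)
        scale = dt / (1.0 - t)
        # Move each coordinate a fraction of the way toward the predicted clean value.
        x = [
            [xi + scale * (yi - xi) for xi, yi in zip(p, q)]
            for p, q in zip(x, x1_hat)
        ]
    return x


# Start from Gaussian noise and integrate with 20 steps.
rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in TARGET]
sample = euler_sample(noise, predict_clean, num_steps=20)
```

With the oracle predictor the trajectory is a straight line, so any step count lands on the target. With a learned predictor, the number of steps trades sampling cost against integration accuracy, and that trade-off is what the reported speedup refers to.
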
Problem

Research questions and friction points this paper is trying to address.

Slow sampling times in 3D molecular generation
Poor chemical validity in generated molecules
Inadequate evaluation methods for 3D molecular generators
Innovation

Methods, ideas, or system contributions that make the work stand out.

E(3)-equivariant message passing architecture
Equivariant flow matching for molecular generation
State-of-the-art results with 20 sampling steps
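
E(3)-equivariance means that rotating, reflecting, or translating the input coordinates transforms the output coordinates in the same way. The sketch below checks that property numerically for a toy distance-weighted coordinate update. The update rule is a generic one chosen for illustration and is not the Semla architecture; `toy_update`, `transform`, and `max_deviation` are hypothetical helpers, and the rotation is restricted to the z axis to keep the code short.

```python
import math


def toy_update(coords):
    """Shift each atom along its relative vectors, weighted by a function of distance."""
    out = []
    for i, xi in enumerate(coords):
        new = list(xi)
        for j, xj in enumerate(coords):
            if i == j:
                continue
            diff = [a - b for a, b in zip(xi, xj)]
            # The weight depends only on the squared distance, so it is E(3)-invariant.
            weight = 1.0 / (1.0 + sum(c * c for c in diff))
            new = [n + weight * d for n, d in zip(new, diff)]
        out.append(new)
    return out


def transform(coords, angle, shift, reflect=False):
    """Optionally mirror the x coordinate, rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y, z in coords:
        if reflect:
            x = -x
        out.append([c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]])
    return out


def max_deviation(a, b):
    """Largest absolute coordinate difference between two point sets."""
    return max(abs(u - v) for p, q in zip(a, b) for u, v in zip(p, q))


coords = [[0.0, 0.0, 0.0], [1.1, 0.2, -0.3], [-0.4, 0.9, 0.5], [0.3, -0.8, 1.2]]
args = (0.7, [1.0, -2.0, 0.5])  # rotation angle in radians, translation vector

# Equivariance: updating transformed inputs equals transforming updated inputs.
err_rot = max_deviation(
    toy_update(transform(coords, *args)),
    transform(toy_update(coords), *args),
)
err_ref = max_deviation(
    toy_update(transform(coords, *args, reflect=True)),
    transform(toy_update(coords), *args, reflect=True),
)
```

Both `err_rot` and `err_ref` should be at floating-point noise level. The property holds because the update uses only relative vectors, which rotate and reflect with the input and ignore translation, and weights that depend only on distances.
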