MolSnap: Snap-Fast Molecular Generation with Latent Variational Mean Flow

📅 2025-08-07

📈 Citations: 0

✨ Influential: 0

career value

197K/year

🤖 AI Summary

Addressing the challenge of simultaneously achieving high molecular quality, diversity, and inference efficiency in text-to-molecule generation, this paper proposes CAT-VMF, a causal-aware molecular generation framework. First, it introduces the Causal-Aware Transformer (CAT) to explicitly model causal dependencies between textual inputs and molecular graph structures. Second, it develops a Variational Mean-Field (VMF) model with a mixture-of-Gaussians prior to enhance latent-space expressiveness and enable one-step, efficient sampling. Evaluated on four standard benchmarks, CAT-VMF achieves state-of-the-art performance: 74.5% novelty, 70.3% diversity, and 100% molecular validity, while requiring only a single function evaluation for conditional generation—substantially outperforming diffusion-based approaches. The core contribution lies in the novel integration of causal modeling with flow-based variational inference, enabling, for the first time, scalable, single-step, text-conditioned molecular generation without compromising generation quality.

Technology Category

Application Category

📝 Abstract

Molecular generation conditioned on textual descriptions is a fundamental task in computational chemistry and drug discovery. Existing methods often struggle to simultaneously ensure high-quality, diverse generation and fast inference. In this work, we propose a novel causality-aware framework that addresses these challenges through two key innovations. First, we introduce a Causality-Aware Transformer (CAT) that jointly encodes molecular graph tokens and text instructions while enforcing causal dependencies during generation. Second, we develop a Variational Mean Flow (VMF) framework that generalizes existing flow-based methods by modeling the latent space as a mixture of Gaussians, enhancing expressiveness beyond unimodal priors. VMF enables efficient one-step inference while maintaining strong generation quality and diversity. Extensive experiments on four standard molecular benchmarks demonstrate that our model outperforms state-of-the-art baselines, achieving higher novelty (up to 74.5%), diversity (up to 70.3%), and 100% validity across all datasets. Moreover, VMF requires only one number of function evaluation (NFE) during conditional generation and up to five NFEs for unconditional generation, offering substantial computational efficiency over diffusion-based methods.

Problem

Research questions and friction points this paper is trying to address.

Generating high-quality diverse molecules from text descriptions

Ensuring fast inference in molecular generation models

Overcoming limitations of unimodal priors in flow-based methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Causality-Aware Transformer for joint encoding

Variational Mean Flow for efficient inference

Gaussian mixture latent space modeling

🔎 Similar Papers

No similar papers found.