TABASCO: A Fast, Simplified Model for Molecular Generation with Improved Physical Quality

📅 2025-07-01

🤖 AI Summary

Existing 3D molecular generation models build in SE(3) equivariance and graph-based message passing, yet their outputs still show limited physical plausibility and inference is slow. This work proposes a minimalist, non-equivariant Transformer architecture that treats the atoms of a molecule as a sequence, omitting both SE(3) constraints and explicit message passing, and reconstructs covalent bonds after generation via deterministic chemical valence rules. To our knowledge, this is the first purely sequential 3D molecular generator without equivariance or message passing. Rotational equivariance emerges during training even though the symmetry is not hard-coded. On GEOM-Drugs, the method achieves state-of-the-art PoseBusters validity while running inference approximately 10× faster than the strongest baseline. The approach is positioned as a blueprint for minimalist, high-throughput generative models suited to structure-based and pharmacophore-based drug design.

📝 Abstract

State-of-the-art models for 3D molecular generation rely on strong inductive biases: SE(3) and permutation equivariance to respect symmetry, and graph message-passing networks to capture local chemistry. Yet the generated molecules still struggle with physical plausibility. We introduce TABASCO, which relaxes these assumptions: the model has a standard non-equivariant transformer architecture, treats the atoms in a molecule as a sequence, and reconstructs bonds deterministically after generation. The absence of equivariant layers and message passing allows us to significantly simplify the model architecture and scale data throughput. On the GEOM-Drugs benchmark, TABASCO achieves state-of-the-art PoseBusters validity and delivers inference roughly 10x faster than the strongest baseline, while exhibiting emergent rotational equivariance despite symmetry not being hard-coded. Our work offers a blueprint for training minimalist, high-throughput generative models suited to specialised tasks such as structure- and pharmacophore-based drug design. Our implementation is available at github.com/carlosinator/tabasco.
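The abstract reports emergent rotational equivariance without saying how it is measured. One common way to quantify it, shown here as a minimal sketch rather than the authors' protocol, is to compare the model's output on rotated coordinates with the rotated output on the original coordinates. The function names and the two toy "models" below are hypothetical stand-ins, and only rotations about the z axis are used to keep the example short.

```python
import math

def rotate_z(points, angle):
    """Rotate a list of (x, y, z) points about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def equivariance_error(model, points, angle):
    """Largest per-atom distance between model(R x) and R model(x).

    A value near zero means the model behaves equivariantly for this rotation.
    """
    out_of_rotated = model(rotate_z(points, angle))
    rotated_output = rotate_z(model(points), angle)
    return max(math.dist(a, b) for a, b in zip(out_of_rotated, rotated_output))

def centering_model(points):
    """Toy equivariant map (assumption): displacement of each atom from the centroid."""
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    return [tuple(p[k] - centroid[k] for k in range(3)) for p in points]

def axis_biased_model(points):
    """Toy non-equivariant map (assumption): shifts every atom along +x."""
    return [(x + 1.0, y, z) for x, y, z in points]

points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.2, 0.3)]
angle = math.pi / 3

err_equivariant = equivariance_error(centering_model, points, angle)
err_biased = equivariance_error(axis_biased_model, points, angle)
print(err_equivariant, err_biased)
```

For a trained generator, `model` would be replaced by the network's coordinate prediction, and the error would typically be averaged over many random rotations and molecules.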

Problem

Research questions and friction points this paper is trying to address.

Improves physical plausibility in 3D molecular generation

Simplifies model architecture without equivariant layers

Enables faster inference for drug design tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses non-equivariant transformer architecture

Treats atoms as a sequence and reconstructs bonds deterministically after generation

Simplifies model for high throughput
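To illustrate what deterministic bond reconstruction after generation can look like, the sketch below infers bonds from generated atom types and coordinates with a simple distance rule: two atoms are bonded if they are closer than the sum of their covalent radii plus a tolerance. This is an assumed, generic heuristic, not necessarily the rule TABASCO uses; the radii table is deliberately tiny, the tolerance is illustrative, and bond orders are not assigned.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (small illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def infer_bonds(elements, coords, tolerance=0.45):
    """Return index pairs (i, j), i < j, judged bonded by a distance cutoff.

    The cutoff for a pair is the sum of the two covalent radii plus
    `tolerance` (an illustrative value, not taken from the paper).
    """
    bonds = []
    for i, j in combinations(range(len(elements)), 2):
        cutoff = COVALENT_RADII[elements[i]] + COVALENT_RADII[elements[j]] + tolerance
        if math.dist(coords[i], coords[j]) <= cutoff:
            bonds.append((i, j))
    return bonds

# Water-like geometry: one oxygen with two hydrogens about 0.96 angstroms away.
elements = ["O", "H", "H"]
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

bonds = infer_bonds(elements, coords)
print(bonds)  # [(0, 1), (0, 2)]
```

Because the rule is a pure function of the generated atoms and coordinates, the same output always yields the same bond graph, which is the property the deterministic post-processing step relies on.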
