🤖 AI Summary
Problem: Generating 3D molecular structures de novo for drug discovery remains challenging because a model must ensure 2D topological validity and 3D physical realism at the same time. Method: We propose Megalodon, a scalable equivariant Transformer architecture featuring a novel continuous–discrete joint denoising objective that can be trained with either diffusion modeling or flow matching, coupled with new energy-focused 3D structure benchmarks. Technically, it jointly optimizes 2D topological constraints and 3D conformational physics via geometrically invariant modeling and multi-granularity molecular representation learning. Results: The 40M-parameter model achieves state-of-the-art performance across multiple 3D generation and energy-based benchmarks: it generates up to 49× more valid large molecules and reaches structure energies 2–10× lower than those of the best prior generative models.
📝 Abstract
De novo 3D molecule generation is a pivotal task in drug discovery. However, many recent geometric generative models struggle to produce high-quality 3D structures, even if they maintain 2D validity and topological stability. To tackle this issue and enhance the learning of effective molecular generation dynamics, we present Megalodon, a family of scalable transformer models. These models are enhanced with basic equivariant layers and trained using a joint continuous and discrete denoising co-design objective. We assess Megalodon's performance on established molecule generation benchmarks and introduce new 3D structure benchmarks that evaluate a model's capability to generate realistic molecular structures, particularly focusing on energetics. We show that Megalodon achieves state-of-the-art results in 3D molecule generation, conditional structure generation, and structure energy benchmarks using diffusion and flow matching. Furthermore, doubling the number of parameters in Megalodon to 40M significantly enhances its performance, generating up to 49x more valid large molecules and achieving energy levels that are 2-10x lower than those of the best prior generative models.
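To make the idea of a joint continuous and discrete denoising objective more concrete, here is a minimal toy sketch in plain Python. It is not the Megalodon implementation: the function names, the Gaussian and masking corruptions, the mean-squared-error and cross-entropy terms, and the `discrete_weight` parameter are all illustrative assumptions. The abstract states only that continuous quantities (such as 3D coordinates) and discrete quantities (such as atom types) are denoised together.

```python
import math
import random

# Hypothetical toy sketch: none of these names come from the paper.

def add_gaussian_noise(coords, sigma, rng):
    """Continuous corruption: perturb every 3D coordinate with Gaussian noise."""
    return [[x + rng.gauss(0.0, sigma) for x in atom] for atom in coords]

def mask_atom_types(types, mask_prob, mask_token, rng):
    """Discrete corruption: swap each atom type for a mask token with probability mask_prob."""
    return [mask_token if rng.random() < mask_prob else t for t in types]

def coordinate_mse(pred, target):
    """Mean squared error over all coordinates (continuous term)."""
    sq = [(p - t) ** 2 for pa, ta in zip(pred, target) for p, t in zip(pa, ta)]
    return sum(sq) / len(sq)

def type_cross_entropy(probs, target):
    """Mean negative log-likelihood of the true atom types (discrete term)."""
    return -sum(math.log(p[t]) for p, t in zip(probs, target)) / len(target)

def joint_denoising_loss(pred_coords, true_coords, pred_type_probs, true_types,
                         discrete_weight=1.0):
    """Assumed form of a joint objective: continuous loss plus weighted discrete loss."""
    return (coordinate_mse(pred_coords, true_coords)
            + discrete_weight * type_cross_entropy(pred_type_probs, true_types))

# Toy "molecule": two atoms, four possible atom types (indices 0..3).
rng = random.Random(0)
true_coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
true_types = [0, 2]
MASK = 4  # assumed extra index used as the mask token

# Corrupted inputs that a denoising network would receive.
noisy_coords = add_gaussian_noise(true_coords, sigma=0.5, rng=rng)
noisy_types = mask_atom_types(true_types, mask_prob=0.5, mask_token=MASK, rng=rng)

# Stand-in for a network output: coordinates off by 1.0, uniform type probabilities.
pred_coords = [[x + 1.0 for x in atom] for atom in true_coords]
pred_type_probs = [[0.25] * 4 for _ in true_types]

loss = joint_denoising_loss(pred_coords, true_coords, pred_type_probs, true_types)
print(loss)
```

In a real model, the predictions would come from an equivariant network conditioned on the corrupted inputs and a noise level or time step, and the two terms would be weighted according to the chosen diffusion or flow matching schedule. None of that is modeled here.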