SigmaDock: Untwisting Molecular Docking With Fragment-Based SE(3) Diffusion

📅 2025-11-06
🤖 AI Summary
To address three key challenges in generative molecular docking (chemically implausible ligand poses, poor generalization to unseen targets, and high computational cost), this work introduces SigmaDock, a fragment-based SE(3) Riemannian diffusion model. Methodologically, it proposes a fragmentation scheme, guided by inductive biases from structural chemistry, that decomposes ligands into rigid-body fragments; the model then generates poses by learning to reassemble these fragments within the protein binding pocket. SigmaDock reports state-of-the-art performance on the PoseBusters set, with Top-1 success rates (RMSD < 2 Å and PB-valid) above 79.9%, compared with the 12.7–30.8% reported by recent deep learning approaches. It also generalizes consistently to unseen proteins and is described as the first deep learning approach to surpass classical physics-based docking under the PB train-test split.

📝 Abstract
Determining the binding pose of a ligand to a protein, known as molecular docking, is a fundamental task in drug discovery. Generative approaches promise faster, improved, and more diverse pose sampling than physics-based methods, but are often hindered by chemically implausible outputs, poor generalisability, and high computational cost. To address these challenges, we introduce a novel fragmentation scheme, leveraging inductive biases from structural chemistry, to decompose ligands into rigid-body fragments. Building on this decomposition, we present SigmaDock, an SE(3) Riemannian diffusion model that generates poses by learning to reassemble these rigid bodies within the binding pocket. By operating at the level of fragments in SE(3), SigmaDock exploits well-established geometric priors while avoiding overly complex diffusion processes and unstable training dynamics. Experimentally, we show SigmaDock achieves state-of-the-art performance, reaching Top-1 success rates (RMSD < 2 Å & PB-valid) above 79.9% on the PoseBusters set, compared to 12.7–30.8% reported by recent deep learning approaches, whilst demonstrating consistent generalisation to unseen proteins. SigmaDock is the first deep learning approach to surpass classical physics-based docking under the PB train-test split, marking a significant leap forward in the reliability and feasibility of deep learning for molecular modelling.
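The Top-1 success criterion quoted in the abstract (RMSD below 2 Å and PB-valid) can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names, the `records` layout, and the precomputed `pb_valid` flag are assumptions made for illustration, and the RMSD here is a plain, non-symmetry-corrected one over identically ordered atoms.

```python
import math


def rmsd(pred, ref):
    """Plain RMSD (in Å) between two identically ordered lists of (x, y, z).

    Assumption: no symmetry correction or alignment is applied here.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def top1_success_rate(records, threshold=2.0):
    """Fraction of complexes whose top-ranked pose is both accurate and valid.

    records: list of (top_pose_coords, reference_coords, pb_valid) tuples,
    where pb_valid is a hypothetical boolean computed elsewhere.
    """
    if not records:
        raise ValueError("records must not be empty")
    hits = sum(
        1
        for pred, ref, pb_valid in records
        if pb_valid and rmsd(pred, ref) < threshold
    )
    return hits / len(records)
```

A pose counts as a success only when both conditions hold, so a geometrically accurate pose that fails the validity checks is scored as a miss.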
Problem

Research questions and friction points this paper is trying to address.

Generating chemically plausible ligand poses for molecular docking
Overcoming poor generalizability in deep learning docking methods
Reducing computational costs while maintaining high accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fragment-based ligand decomposition scheme
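SE(3) Riemannian diffusion model that reassembles rigid fragments within the binding pocket
Fragment-level operation in SE(3) that exploits geometric priors while avoiding overly complex diffusion processes and unstable training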
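To make the idea of rigid-body fragments concrete, the sketch below splits a molecular graph at a given set of rotatable bonds and moves one fragment by a rigid transform x -> R x + t. It is a simplified stand-in, not SigmaDock's fragmentation scheme, which the summary above does not specify in detail; the function names and the `rotatable` input are hypothetical.

```python
import math


def rigid_fragments(num_atoms, bonds, rotatable):
    """Split a molecular graph into connected pieces after cutting rotatable bonds.

    bonds: iterable of (i, j) atom-index pairs.
    rotatable: set of frozenset({i, j}) bonds treated as hinges (hypothetical
    input; a chemistry-aware scheme would derive these from the molecule).
    Returns one sorted atom-index list per fragment.
    """
    adjacency = {i: set() for i in range(num_atoms)}
    for i, j in bonds:
        if frozenset((i, j)) not in rotatable:
            adjacency[i].add(j)
            adjacency[j].add(i)
    seen, fragments = set(), []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Depth-first search collects one connected component (one fragment).
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for neighbour in adjacency[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        fragments.append(sorted(component))
    return fragments


def rotation_z(angle):
    """3x3 rotation matrix about the z-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_se3(coords, rotation, translation):
    """Apply the rigid transform x -> R x + t to every atom of a fragment."""
    return [
        tuple(
            sum(rotation[row][k] * x[k] for k in range(3)) + translation[row]
            for row in range(3)
        )
        for x in coords
    ]
```

For a four-atom chain 0-1-2-3 with the central bond marked rotatable, `rigid_fragments(4, [(0, 1), (1, 2), (2, 3)], {frozenset((1, 2))})` yields two fragments. Because each fragment moves only by a rotation and a translation, its internal bond lengths and angles are preserved by construction.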
Alvaro Prat
Department of Statistics, University of Oxford
Leo Zhang
Department of Statistics, University of Oxford
Charlotte M. Deane
Department of Statistics, University of Oxford
Y. W. Teh
Department of Statistics, University of Oxford
Garrett M. Morris
University of Oxford
Computational Chemistry, Computer-Aided Drug Design, Virtual Screening, Docking, Machine Learning & AI