🤖 AI Summary
Current structure-based drug design (SBDD) methods often treat the discrete molecular graph and the continuous 3D coordinates of a ligand separately, overlooking both the multi-modal nature of the task and the causal relationship between the two modalities. This limits the plausibility of the generated molecules and overall performance. To address this, we propose TransDiffSBDD, a hybrid-modal generative framework that combines an autoregressive transformer, which models the discrete molecular information, with a diffusion model, which samples the continuous coordinates. A hybrid-modal sequence encoding of the protein-ligand complex is designed to explicitly respect the causality between modalities. On the CrossDocked2020 benchmark, TransDiffSBDD outperforms existing baselines.
📝 Abstract
Structure-based drug design (SBDD) is a critical task in drug discovery, requiring the generation of molecular information across two distinct modalities: discrete molecular graphs and continuous 3D coordinates. However, existing SBDD methods often overlook two key challenges: (1) the multi-modal nature of this task and (2) the causal relationship between these modalities, which limits the plausibility of the generated molecules and overall performance. To address both challenges, we propose TransDiffSBDD, an integrated framework combining autoregressive transformers and diffusion models for SBDD. Specifically, the autoregressive transformer models discrete molecular information, while the diffusion model samples continuous distributions, effectively resolving the first challenge. To address the second challenge, we design a hybrid-modal sequence for protein-ligand complexes that explicitly respects the causality between modalities. Experiments on the CrossDocked2020 benchmark demonstrate that TransDiffSBDD outperforms existing baselines.
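The idea of a hybrid-modal sequence can be illustrated with a minimal Python sketch. It is not the TransDiffSBDD implementation: the abstract does not specify the sequence layout, so the ordering used here (pocket tokens, a separator, ligand graph tokens, then one coordinate slot per ligand atom) is an assumption, and the names `DiscreteToken`, `ContinuousSlot`, `build_hybrid_sequence`, and `causal_mask` are hypothetical. As a further simplification, the pocket is represented by symbols only. The sketch shows only how a causal (lower-triangular) mask lets continuous positions condition on the discrete tokens that precede them, but not the other way around.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass(frozen=True)
class DiscreteToken:
    """A position an autoregressive transformer would predict (hypothetical)."""
    symbol: str


@dataclass(frozen=True)
class ContinuousSlot:
    """A position whose 3D coordinates a diffusion model would sample (hypothetical)."""
    atom_index: int


Element = Union[DiscreteToken, ContinuousSlot]


def build_hybrid_sequence(
    pocket_symbols: Sequence[str], ligand_symbols: Sequence[str]
) -> List[Element]:
    """Assumed layout: pocket context, separator, ligand graph, ligand coordinates."""
    seq: List[Element] = [DiscreteToken(s) for s in pocket_symbols]
    seq.append(DiscreteToken("<sep>"))
    # Discrete ligand tokens come first in this assumed ordering ...
    seq.extend(DiscreteToken(s) for s in ligand_symbols)
    # ... so every coordinate slot is placed after the graph it depends on.
    seq.extend(ContinuousSlot(i) for i in range(len(ligand_symbols)))
    return seq


def causal_mask(n: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


seq = build_hybrid_sequence(["N", "C"], ["C", "O"])
mask = causal_mask(len(seq))
```

In this toy layout, the first coordinate slot (index 5) can attend to the ligand token at index 3, while that token cannot attend to the later coordinate slot, which is the one-directional dependence a causal ordering is meant to enforce.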