🤖 AI Summary
Existing pre-trained models for structure-based drug discovery (SBDD) focus on either small molecules or proteins in isolation, neglecting the cross-domain binding interactions between them. Method: BIT is a general-purpose foundation model that unifies representation learning across small molecules, proteins, and protein-ligand complexes, in both 2D and 3D formats. It introduces two mixture-of-experts mechanisms: Mixture-of-Domain-Experts (MoDE), which handles biomolecules from diverse biochemical domains, and Mixture-of-Structure-Experts (MoSE), which captures positional dependencies in molecular structures. Pre-training is performed on a shared Transformer backbone via unified self-supervised denoising tasks spanning all domains. Contribution/Results: BIT achieves exceptional performance on binding affinity prediction, structure-based virtual screening, and molecular property prediction, demonstrating that deep fusion with domain-specific encoding effectively captures fine-grained protein-ligand interactions.
📝 Abstract
Structure-based drug discovery (SBDD) is a systematic scientific process that develops new drugs by leveraging the detailed physical structure of the target protein. Recent advances in pre-trained models for biomolecules have demonstrated remarkable success across various biochemical applications, including drug discovery and protein engineering. However, most of these pre-trained models focus on the characteristics of either small molecules or proteins, without delving into their binding interactions, the essential cross-domain relationships pivotal to SBDD. To fill this gap, we propose a general-purpose foundation model named BIT (short for Biomolecular Interaction Transformer), which can encode a range of biochemical entities, including small molecules, proteins, and protein-ligand complexes, in various data formats spanning both 2D and 3D structures. Specifically, we introduce Mixture-of-Domain-Experts (MoDE) to handle biomolecules from diverse biochemical domains and Mixture-of-Structure-Experts (MoSE) to capture positional dependencies in molecular structures. This mixture-of-experts approach enables BIT to achieve both deep fusion and domain-specific encoding, effectively capturing fine-grained molecular interactions within protein-ligand complexes. We then perform cross-domain pre-training on the shared Transformer backbone via several unified self-supervised denoising tasks. Experimental results on various benchmarks demonstrate that BIT achieves exceptional performance in downstream tasks, including binding affinity prediction, structure-based virtual screening, and molecular property prediction.
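To make the Mixture-of-Domain-Experts idea concrete, here is a minimal sketch (not the paper's implementation) of domain-aware routing: each token in a protein-ligand complex is sent to the feed-forward expert matching its domain tag, while the rest of the layer stays shared. The domain tags, hidden size, and expert shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # hidden size (illustrative)

def ffn_expert(d_in, d_hid, rng):
    """One feed-forward expert: two weight matrices with a ReLU in between."""
    return (rng.standard_normal((d_in, d_hid)) * 0.1,
            rng.standard_normal((d_hid, d_in)) * 0.1)

# Hypothetical domain tags: 0 = ligand atom token, 1 = protein residue token.
experts = {0: ffn_expert(D, 16, rng), 1: ffn_expert(D, 16, rng)}

def mode_layer(x, domains, experts):
    """MoDE sketch: route each token to its domain-specific expert.

    x       : (n_tokens, D) shared token representations
    domains : (n_tokens,) integer domain tag per token
    """
    out = np.empty_like(x)
    for dom, (W1, W2) in experts.items():
        mask = domains == dom
        if mask.any():
            h = np.maximum(x[mask] @ W1, 0.0)  # ReLU
            out[mask] = h @ W2
    return out

# A toy complex: 3 ligand tokens followed by 5 protein tokens.
x = rng.standard_normal((8, D))
domains = np.array([0, 0, 0, 1, 1, 1, 1, 1])
y = mode_layer(x, domains, experts)
print(y.shape)  # (8, 8)
```

Because routing is by domain tag rather than a learned gate, ligand and protein tokens get specialized transformations while still attending to each other in the shared backbone, which is the "deep fusion plus domain-specific encoding" trade-off the abstract describes.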