AI Summary
This work proposes ConGLUDe, a framework that unifies structure-based and ligand-based drug design in a single contrastive geometric model. Traditional approaches treat these data sources in isolation and rely on predefined binding pockets. ConGLUDe is trained jointly on protein-ligand complexes and large-scale bioactivity data; it pairs a geometric protein encoder, which produces embeddings for the whole protein and for implicitly predicted binding sites, with a fast ligand encoder. This supports zero-shot virtual screening, target identification (target fishing), and ligand-conditioned binding site prediction without pre-specified pockets. Across the reported benchmarks, ConGLUDe achieves competitive zero-shot virtual screening performance, substantially outperforms existing methods on a challenging target fishing task, and achieves state-of-the-art ligand-conditioned pocket selection.
Abstract
Structure-based and ligand-based computational drug design have traditionally relied on disjoint data sources and modeling assumptions, limiting their joint use at scale. In this work, we introduce Contrastive Geometric Learning for Unified Computational Drug Design (ConGLUDe), a single contrastive geometric model that unifies structure- and ligand-based training. ConGLUDe couples a geometric protein encoder that produces whole-protein representations and implicit embeddings of predicted binding sites with a fast ligand encoder, removing the need for pre-defined pockets. By aligning ligands with both global protein representations and multiple candidate binding sites through contrastive learning, ConGLUDe supports ligand-conditioned pocket prediction in addition to virtual screening and target fishing, while being trained jointly on protein-ligand complexes and large-scale bioactivity data. Across diverse benchmarks, ConGLUDe achieves competitive zero-shot virtual screening performance, substantially outperforms existing methods on a challenging target fishing task, and demonstrates state-of-the-art ligand-conditioned pocket selection. These results highlight the advantages of unified structure-ligand training and position ConGLUDe as a step toward general-purpose foundation models for drug discovery.
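The contrastive alignment described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the loss, so a symmetric InfoNCE objective is assumed here, and the function names (`info_nce`, `select_pocket`, `rank_ligands`), the temperature of 0.07, and the use of cosine similarity are all hypothetical choices made for illustration. Embeddings are taken as given arrays; the geometric protein encoder and the ligand encoder are out of scope.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def info_nce(protein_emb, ligand_emb, temperature=0.07):
    """Symmetric InfoNCE loss (assumed objective, not confirmed by the source).

    Row i of protein_emb and row i of ligand_emb are treated as a matched
    pair; every other row in the batch acts as a negative.
    """
    p = l2_normalize(protein_emb)
    l = l2_normalize(ligand_emb)
    logits = p @ l.T / temperature          # shape: (batch, batch)
    idx = np.arange(len(p))
    loss_p2l = -log_softmax(logits, axis=1)[idx, idx].mean()  # protein -> ligand
    loss_l2p = -log_softmax(logits, axis=0)[idx, idx].mean()  # ligand -> protein
    return 0.5 * (loss_p2l + loss_l2p)


def select_pocket(pocket_embs, ligand_emb):
    """Ligand-conditioned pocket selection: pick the most similar candidate site."""
    scores = l2_normalize(pocket_embs) @ l2_normalize(ligand_emb)
    return int(np.argmax(scores)), scores


def rank_ligands(protein_emb, ligand_embs):
    """Zero-shot screening: order library ligands by similarity, best first."""
    scores = l2_normalize(ligand_embs) @ l2_normalize(protein_emb)
    return np.argsort(-scores)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ligands = rng.normal(size=(4, 16))
    # Toy "proteins" that sit close to their matched ligands in embedding space.
    proteins = ligands + 0.01 * rng.normal(size=(4, 16))
    print("matched loss: ", info_nce(proteins, ligands))
    print("shuffled loss:", info_nce(proteins[::-1], ligands))
```

Under these assumptions, one shared embedding space serves all three tasks: the same similarity score ranks ligands for a protein (virtual screening), ranks proteins for a ligand (target fishing), and chooses among candidate binding sites for a given ligand (pocket selection).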