🤖 AI Summary
Euclidean embeddings struggle to model the hierarchical nature of molecular interactions and fine-grained affinity differences in protein–ligand binding prediction. To address this, we propose HypSeek, the first framework to leverage Lorentzian hyperbolic space for protein–ligand representation learning. Exploiting the exponential geometry of hyperbolic space, HypSeek jointly models molecular structural relationships and binding affinities, unifying virtual screening and affinity ranking in a single framework. Our method employs a protein-guided three-tower architecture that co-embeds ligands, protein binding pockets, and protein sequences within hyperbolic space. On DUD-E, HypSeek achieves a 20.7% relative improvement in early enrichment; on the JACS dataset, it attains a 25.4% relative gain in affinity ranking correlation. It substantially outperforms Euclidean baselines, particularly in "activity cliff" scenarios where small structural changes induce large affinity shifts.
📝 Abstract
Protein-ligand binding prediction is central to virtual screening and affinity ranking, two fundamental tasks in drug discovery. While recent retrieval-based methods embed ligands and protein pockets into Euclidean space for similarity-based search, the geometry of Euclidean embeddings often fails to capture the hierarchical structure and fine-grained affinity variations intrinsic to molecular interactions. In this work, we propose HypSeek, a hyperbolic representation learning framework that embeds ligands, protein pockets, and sequences into Lorentz-model hyperbolic space. By leveraging the exponential geometry and negative curvature of hyperbolic space, HypSeek enables expressive, affinity-sensitive embeddings that can effectively model both global activity and subtle functional differences, particularly in challenging cases such as activity cliffs, where structurally similar ligands exhibit large affinity gaps. Our model unifies virtual screening and affinity ranking in a single framework, introducing a protein-guided three-tower architecture to enhance representational structure. HypSeek improves early enrichment in virtual screening on DUD-E from 42.63 to 51.44 (+20.7%) and affinity ranking correlation on JACS from 0.5774 to 0.7239 (+25.4%), demonstrating the benefits of hyperbolic geometry across both tasks and highlighting its potential as a powerful inductive bias for protein-ligand modeling.
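
To make the Lorentz-model geometry mentioned in the abstract concrete, the following is a minimal sketch of the standard hyperboloid operations at curvature -1: the Lorentzian inner product, the exponential map at the origin, and the geodesic distance. It is not taken from HypSeek. The function names, the fixed curvature, and the toy vectors are assumptions for illustration only, and the abstract does not specify how the actual model scores ligand-pocket pairs.

```python
import math

# Illustrative sketch only: standard Lorentz-model formulas at curvature -1.
# Function names and toy inputs are hypothetical, not the HypSeek code.

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def exp_map_origin(u):
    """Map a Euclidean (tangent) vector u onto the hyperboloid via the
    exponential map at the origin (1, 0, ..., 0)."""
    norm = math.sqrt(sum(a * a for a in u))
    if norm == 0.0:
        return [1.0] + [0.0] * len(u)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in u]

def lorentz_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L); the argument is clamped to
    1.0 to guard against round-off pushing it slightly below 1."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

# Hypothetical encoder outputs (Euclidean vectors) for a pocket and a ligand.
pocket = exp_map_origin([0.3, 0.4])
ligand = exp_map_origin([0.6, 0.8])
origin = exp_map_origin([0.0, 0.0])

# Smaller hyperbolic distance would be read as higher similarity in a
# retrieval setting (an assumption of this sketch).
score = -lorentz_distance(pocket, ligand)
```

In this sketch, the distance from the origin to an embedded point equals the Euclidean norm of the input vector, while distances between points far from the origin grow much faster than their Euclidean counterparts. That growth is the "exponential geometry" the abstract relies on to separate structurally similar ligands with different affinities.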