🤖 AI Summary
This work addresses the challenge of predicting binding affinity for multi-domain protein–ligand complexes, where existing methods are limited by the rigid-protein assumption, by the difficulty of modeling inter-domain dynamics, and by noise from flexible regions. To overcome these limitations, the authors propose HCLBind, a framework that decouples geometric representation learning from affinity regression via self-supervised contrastive learning. HCLBind adopts a “general-to-specific” pretraining paradigm on the Q-BioLiP database and introduces a hierarchical decoy strategy: coordinate perturbation in single-domain proteins teaches local physicochemical constraints, while inter-domain rotation in multi-domain complexes teaches global conformational geometry. The architecture combines a domain-gated graph attention network with cross-modal attention to focus explicitly on domain interfaces, and applies LoRA adapters to protein and ligand foundation models so that fine-tuning stays efficient while evolutionary information is preserved. In experiments on PDBBind, the authors report that HCLBind learns discriminative interface features and provides robust uncertainty estimates, which they present as overcoming limitations of standard supervised learning.
📝 Abstract
Predicting protein-ligand binding affinity remains intractable for multi-domain proteins, where inter-domain dynamics govern molecular recognition. Existing geometric deep learning methods typically treat proteins as monolithic static graphs, and therefore suffer from rigid-body assumptions and aleatoric noise in flexible regions. To address this, we introduce HCLBind, a self-supervised framework that decouples geometric representation learning from affinity regression. HCLBind leverages a general-to-specific pre-training paradigm on the Q-BioLiP database to learn a robust physical grammar of binding. We propose a novel hierarchical decoy strategy: the model learns local physicochemical constraints through protein coordinate perturbation in single-domain proteins and global conformational geometry through inter-domain rotation in multi-domain complexes. Our hybrid architecture integrates a domain-gated graph attention network and cross-modal attention to explicitly prioritize domain interfaces. Furthermore, we apply low-rank adaptation (LoRA) to protein and ligand foundation models, ensuring efficient optimization while preserving evolutionary knowledge. Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation, overcoming the limitations of standard supervised learning. The code is available at https://github.com/jiankliu/HCLBind.
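To make the hierarchical decoy strategy concrete, the following is a minimal sketch of how the two kinds of decoys could be generated from atomic coordinates. It is not taken from the HCLBind code. The function names (`perturb_coordinates`, `rotate_domain`), the Gaussian noise model, and the choice of rotation axis, angle, and pivot are all illustrative assumptions, since the abstract does not specify them.

```python
import math
import random
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def perturb_coordinates(coords: Sequence[Coord], sigma: float,
                        rng: random.Random) -> List[Coord]:
    """Local decoy (assumed form): add isotropic Gaussian noise to every atom.

    sigma is the noise standard deviation, in the same units as coords.
    """
    return [
        (x + rng.gauss(0.0, sigma),
         y + rng.gauss(0.0, sigma),
         z + rng.gauss(0.0, sigma))
        for x, y, z in coords
    ]


def rotate_domain(coords: Sequence[Coord], in_domain: Sequence[bool],
                  axis: Coord, angle_deg: float, pivot: Coord) -> List[Coord]:
    """Global decoy (assumed form): rigidly rotate one domain about a pivot.

    Atoms flagged in in_domain are rotated by angle_deg around the line
    through pivot with direction axis; all other atoms are left untouched.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    rotated: List[Coord] = []
    for atom, flag in zip(coords, in_domain):
        if not flag:
            rotated.append(tuple(atom))
            continue
        # Vector from the pivot to the atom.
        vx, vy, vz = (a - p for a, p in zip(atom, pivot))
        # Rodrigues' formula: v' = v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t))
        cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        dot = kx * vx + ky * vy + kz * vz
        rx = vx * cos_t + cx * sin_t + kx * dot * (1.0 - cos_t)
        ry = vy * cos_t + cy * sin_t + ky * dot * (1.0 - cos_t)
        rz = vz * cos_t + cz * sin_t + kz * dot * (1.0 - cos_t)
        rotated.append((rx + pivot[0], ry + pivot[1], rz + pivot[2]))
    return rotated
```

In a contrastive setup of this kind, the native complex would serve as the positive example, while outputs of `perturb_coordinates` (for single-domain proteins) and `rotate_domain` (for multi-domain complexes) would serve as negatives. The rigid rotation preserves all distances inside the rotated domain, so only the inter-domain geometry changes.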