🤖 AI Summary
Accurately predicting protein–ligand binding affinity remains a critical challenge in drug discovery. Existing deep learning methods over-rely on 3D structural features while neglecting prior biological knowledge—such as Gene Ontology (GO) annotations and ligand biochemical properties—leading to limited generalizability and interpretability. To address this, we propose KEPLA, the first knowledge graph (KG)-enhanced model that jointly integrates GO annotations and ligand physicochemical knowledge. KEPLA introduces a joint optimization framework combining KG relation alignment and local cross-attention, enabling global semantic alignment between protein sequences and ligand molecular graphs as well as fine-grained cross-modal representation learning. The model adopts a dual-objective multi-task architecture that unifies KG embedding learning with cross-modal attention. On both in-domain and cross-domain benchmarks, KEPLA significantly outperforms state-of-the-art methods. Visualization and ablation studies further validate its biological interpretability and knowledge-aware reasoning capability.
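The KG relation-alignment objective described above can be illustrated with a minimal sketch. The summary does not specify which KG embedding scoring function KEPLA uses, so a TransE-style translational score with a margin ranking loss is assumed here purely for illustration; the embeddings and the `margin` value are hypothetical stand-ins for the model's learned global representations.

```python
import numpy as np

def transe_score(head, relation, tail):
    """TransE-style plausibility score: a triple (head, relation, tail)
    is plausible when head + relation lands close to tail, so a
    smaller distance means a better-aligned triple."""
    return float(np.linalg.norm(head + relation - tail))

def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Push the score of a true triple below that of a corrupted
    (negative) triple by at least `margin`."""
    return max(0.0, margin + pos_score - neg_score)

# Hypothetical global embeddings: a protein (head), a GO/property
# relation, and a ligand (tail), plus a corrupted ligand as a negative.
rng = np.random.default_rng(0)
dim = 8
protein_emb = rng.normal(size=dim)
relation_emb = rng.normal(size=dim)
ligand_emb = rng.normal(size=dim)
corrupt_ligand = rng.normal(size=dim)

pos = transe_score(protein_emb, relation_emb, ligand_emb)
neg = transe_score(protein_emb, relation_emb, corrupt_ligand)
loss = margin_ranking_loss(pos, neg)
print(f"alignment loss: {loss:.4f}")
```

Minimizing such a loss over knowledge-graph triples is what nudges the global protein and ligand representations toward the biochemical relations recorded in the KG, alongside the affinity-prediction objective.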
📝 Abstract
Accurate prediction of protein–ligand binding affinity is critical for drug discovery. While recent deep learning approaches have demonstrated promising results, they often rely solely on structural features, overlooking valuable biochemical knowledge associated with binding affinity. To address this limitation, we propose KEPLA, a novel deep learning framework that explicitly integrates prior knowledge from Gene Ontology annotations of proteins and physicochemical properties of ligands to enhance prediction performance. KEPLA takes protein sequences and ligand molecular graphs as input and optimizes two complementary objectives: (1) aligning global representations with knowledge graph relations to capture domain-specific biochemical insights, and (2) leveraging cross-attention between local representations to construct fine-grained joint embeddings for prediction. Experiments on two benchmark datasets across both in-domain and cross-domain scenarios demonstrate that KEPLA consistently outperforms state-of-the-art baselines. Furthermore, interpretability analyses based on knowledge graph relations and cross-attention maps provide valuable insights into the underlying predictive mechanisms.
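The second objective, cross-attention between local representations, can be sketched as follows. The abstract does not give the exact attention formulation, so standard scaled dot-product attention is assumed, with protein residue embeddings as queries and ligand atom embeddings as keys/values; all dimensions and the mean-pooling step are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(protein_local, ligand_local):
    """Each protein residue attends over all ligand atoms.
    Returns residue-wise context vectors and the attention map,
    which is what interpretability analyses would visualize."""
    d = protein_local.shape[-1]
    scores = protein_local @ ligand_local.T / np.sqrt(d)  # (n_res, n_atoms)
    attn = softmax(scores, axis=-1)                       # rows sum to 1
    context = attn @ ligand_local                         # (n_res, d)
    return context, attn

# Hypothetical local embeddings: 5 residues and 3 atoms, dim 8.
rng = np.random.default_rng(1)
protein_local = rng.normal(size=(5, 8))
ligand_local = rng.normal(size=(3, 8))

context, attn = cross_attention(protein_local, ligand_local)
# One simple way to form a joint embedding for the affinity head:
# pool the residue features and their ligand-aware contexts.
joint = np.concatenate([protein_local.mean(axis=0), context.mean(axis=0)])
print(attn.shape, joint.shape)  # (5, 3) (16,)
```

The attention map `attn` gives a residue-by-atom weighting, which is the kind of fine-grained signal the abstract refers to when mentioning interpretability via cross-attention maps.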