Tokenizing Electron Cloud in Protein-Ligand Interaction Learning

📅 2025-05-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of conventional atomic-level structural modeling in protein–ligand binding affinity prediction: it neglects quantum chemical effects. We propose the first end-to-end deep learning framework that explicitly incorporates 3D electron cloud density as a geometric signal. Our method introduces a structure-aware Transformer to encode electron density fields, employs hierarchical vector quantization (VQ) for learnable compression of electron cloud embeddings, and establishes an electron-cloud-agnostic knowledge distillation paradigm to enhance generalizability. By modeling quantum chemical features—specifically, electron clouds—as differentiable, learnable geometric signals, it overcomes the representational bottleneck inherent in coordinate-only atomic models. Evaluated on multiple standard benchmarks, our method achieves state-of-the-art performance, improving per-structure Pearson and Spearman correlation coefficients by 6.42% and 15.58%, respectively.
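The hierarchical VQ step described above can be illustrated with a minimal residual-quantization sketch. All names, dimensions, and codebook sizes below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def quantize(embeddings, codebook):
    """Map each embedding to its nearest codebook vector (one VQ level)."""
    # Squared L2 distance from every embedding to every code vector.
    d = ((embeddings[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    codes = d.argmin(axis=1)            # discrete token id per embedding
    return codes, codebook[codes]       # token ids and quantized vectors

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 16))          # 5 electron-cloud patch embeddings (toy)
coarse = rng.normal(size=(8, 16))       # level-1 codebook with 8 codes
fine = rng.normal(size=(32, 16))        # level-2 codebook with 32 codes

# Hierarchical (residual) quantization: quantize, then quantize the residual
# left over after subtracting the coarse reconstruction.
c1, q1 = quantize(emb, coarse)
c2, q2 = quantize(emb - q1, fine)
tokens = np.stack([c1, c2], axis=1)     # a two-level discrete token per patch
print(tokens.shape)                     # (5, 2)
```

The resulting discrete tokens are what a downstream predictor would consume in place of raw density fields; in a trained VQ model the codebooks are learned, not random.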

📝 Abstract
The affinity and specificity of protein–molecule binding directly impact functional outcomes and underlie the mechanisms of biological regulation and signal transduction. Most deep-learning-based prediction approaches focus on atom- or fragment-level structures. However, quantum chemical properties such as electronic structure are key to unveiling interaction patterns, yet remain largely underexplored. To bridge this gap, we propose ECBind, a method for tokenizing electron cloud signals into quantized embeddings, enabling their integration into downstream tasks such as binding affinity prediction. By incorporating electron densities, ECBind helps uncover binding modes that cannot be fully represented by atom-level models. Specifically, to remove the redundancy inherent in electron cloud signals, a structure-aware transformer and hierarchical codebooks encode 3D binding sites enriched with electronic structure into tokens. These tokenized codes are then used for specific labeled tasks. To extend applicability to a wider range of scenarios, we use knowledge distillation to develop an electron-cloud-agnostic prediction model. Experimentally, ECBind demonstrates state-of-the-art performance across multiple tasks, achieving improvements of 6.42% and 15.58% in per-structure Pearson and Spearman correlation coefficients, respectively.
Problem

Research questions and friction points this paper is trying to address.

Predict protein-ligand binding using electron cloud signals
Tokenize electron densities for deep learning integration
Improve binding affinity prediction beyond atom-level models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tokenizing electron cloud into quantized embeddings
Using structure-aware transformer and hierarchical codebooks
Knowledge distillation for electron-cloud-agnostic model
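The distillation idea in the last bullet, training an electron-cloud-agnostic student to mimic an electron-cloud-aware teacher, can be sketched with a generic blended objective. ECBind's actual loss is not specified here; the MSE blend, `alpha`, and all numeric values below are illustrative assumptions:

```python
import numpy as np

def distillation_loss(student_pred, teacher_pred, labels, alpha=0.5):
    """Blend a supervised regression loss with a teacher-matching term."""
    supervised = np.mean((student_pred - labels) ** 2)      # fit true affinities
    matching = np.mean((student_pred - teacher_pred) ** 2)  # mimic the teacher
    return alpha * supervised + (1 - alpha) * matching

labels  = np.array([6.1, 7.3, 5.0])   # toy binding affinity labels (e.g. pK)
teacher = np.array([6.0, 7.1, 5.4])   # electron-cloud-aware model outputs
student = np.array([5.8, 7.5, 5.2])   # atom-only (electron-cloud-agnostic) outputs

loss = distillation_loss(student, teacher, labels)
print(round(loss, 4))                 # 0.0683
```

At inference time only the student runs, so predictions no longer require computing electron densities, which is what makes the distilled model applicable to scenarios where density fields are unavailable.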
Authors

Haitao Lin · AI Lab, Research Center for Industries of the Future, Westlake University
Odin Zhang · UW CS, Institute for Protein Design · AI4Science, Biomolecule Design, Drug Design, Computer-aided Drug Design
Jia Xu · AI Lab, Research Center for Industries of the Future, Westlake University
Yunfan Liu · AI Lab, Research Center for Industries of the Future, Westlake University
Zheng Cheng · DP Technology
Lirong Wu · Zhejiang University & Westlake University · Geometric Deep Learning, AI4Science
Yufei Huang · AI Lab, Research Center for Industries of the Future, Westlake University
Zhifeng Gao · DP Technology · Data Mining, Machine Learning, AI for Science, AI for Industry
Stan Z. Li · AI Lab, Research Center for Industries of the Future, Westlake University