UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection

📅 2025-06-03

🤖 AI Summary
This paper addresses three challenges in protein-ligand binding site detection: datasets and methods centered on individual complexes, which miss binding sites that appear only in other complexes of the same protein; fragmented pipelines that run binary segmentation followed by clustering; and evaluation metrics that do not reflect actual prediction quality. It introduces UniSite-DS, the first UniProt-centric ligand binding site dataset, with 4.81 times more multi-site data and 2.08 times more overall data than the previously most widely used datasets. It also proposes UniSite, an end-to-end detection framework that replaces the separate segmentation and clustering stages with set-based prediction, supervised through bipartite (bijective) matching between predicted and ground-truth sites using the Hungarian algorithm. Finally, it adopts Average Precision (AP) based on Intersection over Union (IoU) as the evaluation metric. Experiments on UniSite-DS and several representative benchmarks show that IoU-based AP reflects prediction quality more accurately and that UniSite outperforms current state-of-the-art methods.
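
The matching step behind set-based prediction can be illustrated with a small sketch. It is not the UniSite implementation: it assumes each binding site is a set of residue indices, uses IoU as the matching score (the paper's actual matching cost may differ), and finds the best one-to-one assignment by brute force over permutations instead of the Hungarian algorithm. For a handful of sites both approaches reach the same optimum. The names `iou` and `match_sites` are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over Union of two residue-index sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_sites(predicted, ground_truth):
    """Return sorted (pred_idx, gt_idx) pairs forming the one-to-one
    assignment with the highest total IoU.

    Assumption: brute force stands in for the Hungarian algorithm here,
    so this is only practical for a small number of sites.
    """
    # Assign every item of the smaller side to a distinct item of the larger side.
    small, large, flipped = predicted, ground_truth, False
    if len(predicted) > len(ground_truth):
        small, large, flipped = ground_truth, predicted, True

    best_score, best_pairs = -1.0, []
    for perm in permutations(range(len(large)), len(small)):
        score = sum(iou(small[i], large[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))

    if flipped:
        # Pairs were built as (gt_idx, pred_idx); restore (pred_idx, gt_idx).
        best_pairs = [(j, i) for i, j in best_pairs]
    return sorted(best_pairs)


# Toy example: three predicted sites, two annotated sites.
predicted = [{1, 2, 3, 4}, {10, 11, 12}, {20, 21}]
ground_truth = [{10, 11, 12, 13}, {1, 2, 3}]
pairs = match_sites(predicted, ground_truth)
print(pairs)
```

In this toy example, prediction 0 pairs with annotated site 1 and prediction 1 with annotated site 0. Prediction 2 has zero overlap with both annotated sites, so it stays unmatched and would be treated as a spurious site.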

📝 Abstract
The detection of ligand binding sites for proteins is a fundamental step in structure-based drug design. Despite notable advances in recent years, existing methods, datasets, and evaluation metrics are confronted with several key challenges: (1) current datasets and methods are centered on individual protein-ligand complexes and neglect that diverse binding sites may exist across multiple complexes of the same protein, introducing significant statistical bias; (2) ligand binding site detection is typically modeled as a discontinuous workflow, employing binary segmentation and subsequent clustering algorithms; (3) traditional evaluation metrics do not adequately reflect the actual performance of different binding site prediction methods. To address these issues, we first introduce UniSite-DS, the first UniProt (Unique Protein)-centric ligand binding site dataset, which contains 4.81 times more multi-site data and 2.08 times more overall data than the previously most widely used datasets. We then propose UniSite, the first end-to-end ligand binding site detection framework supervised by set prediction loss with bijective matching. In addition, we introduce Average Precision based on Intersection over Union (IoU) as a more accurate evaluation metric for ligand binding site prediction. Extensive experiments on UniSite-DS and several representative benchmark datasets demonstrate that IoU-based Average Precision provides a more accurate reflection of prediction quality, and that UniSite outperforms current state-of-the-art methods in ligand binding site detection. The dataset and code will be made publicly available at https://github.com/quanlin-wu/unisite.
Problem

Research questions and friction points this paper is trying to address.

Detects diverse binding sites across multiple complexes of the same protein
Replaces discontinuous workflow with end-to-end detection framework
Introduces improved evaluation metric for binding site prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

First UniProt-centric ligand binding site dataset
End-to-end detection framework with set prediction
IoU-based Average Precision for accurate evaluation
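
To make the IoU-based Average Precision idea concrete, here is a minimal sketch in the style of object-detection AP. It is an assumption-laden illustration, not the paper's evaluation protocol: sites are sets of residue indices, each prediction carries a confidence score, a prediction counts as a hit when its IoU with a still-unmatched annotated site reaches `iou_threshold`, and AP is the non-interpolated mean of the precision at each hit divided by the number of annotated sites. The names `iou`, `average_precision`, and `iou_threshold` are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two residue-index sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_precision(predictions, ground_truth, iou_threshold=0.5):
    """Non-interpolated AP for one protein.

    predictions:  list of (confidence, residue_set) tuples
    ground_truth: list of residue sets
    Assumption: the matching rule and threshold are illustrative choices.
    """
    if not ground_truth:
        return 0.0

    matched = set()          # indices of annotated sites already claimed
    hits = 0
    precision_at_hits = []

    # Walk through predictions from most to least confident.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    for rank, (_, site) in enumerate(ranked, start=1):
        best_iou, best_gt = 0.0, None
        for g, gt in enumerate(ground_truth):
            if g in matched:
                continue
            value = iou(site, gt)
            if value > best_iou:
                best_iou, best_gt = value, g
        if best_gt is not None and best_iou >= iou_threshold:
            matched.add(best_gt)
            hits += 1
            precision_at_hits.append(hits / rank)

    # Annotated sites that were never hit contribute zero precision.
    return sum(precision_at_hits) / len(ground_truth)


ground_truth = [{1, 2, 3, 4}, {10, 11, 12, 13}]
predictions = [
    (0.9, {1, 2, 3, 4}),   # exact hit on the first site
    (0.8, {20, 21}),       # spurious site
    (0.7, {10, 11, 12}),   # partial hit on the second site (IoU = 0.75)
]
ap_loose = average_precision(predictions, ground_truth, iou_threshold=0.5)
ap_strict = average_precision(predictions, ground_truth, iou_threshold=0.8)
print(ap_loose, ap_strict)
```

Raising the threshold from 0.5 to 0.8 turns the partial hit into a miss, so the score drops from 5/6 to 0.5. Both the ranking of predictions and the overlap with the annotated site affect the score, which is the sensitivity that an IoU-based AP metric is meant to capture.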