AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking

📅 2025-06-21

🤖 AI Summary
Antibody-antigen binding affinity prediction is hindered by noisy experimental labels, heterogeneous assay conditions, and poor generalization. To address these challenges, we propose AbRank, a benchmark and evaluation framework that reformulates affinity prediction as a pairwise ranking task. AbRank aggregates over 380,000 binding assays from nine heterogeneous sources and introduces standardized data splits with systematically increasing distribution shift, from local perturbations such as point mutations to generalization across *novel antibodies* and *novel antigens*. We define an *m*-confident ranking framework that filters out comparisons with marginal affinity differences, so that training focuses on pairs with at least an *m*-fold difference in measured binding strength. Our baseline model, WALLE-Affinity, combines protein language model (PLM) embeddings with 3D structural representations via a graph neural network and employs metric learning to optimize ranking performance. Experiments show that existing methods degrade significantly under realistic generalization settings, whereas ranking-based training improves robustness and transferability, offering a foundation for scalable, structure-aware antibody therapeutic design.

📝 Abstract
Accurate prediction of antibody-antigen (Ab-Ag) binding affinity is essential for therapeutic design and vaccine development, yet the performance of current models is limited by noisy experimental labels, heterogeneous assay conditions, and poor generalization across the vast antibody and antigen sequence space. We introduce AbRank, a large-scale benchmark and evaluation framework that reframes affinity prediction as a pairwise ranking problem. AbRank aggregates over 380,000 binding assays from nine heterogeneous sources, spanning diverse antibodies, antigens, and experimental conditions, and introduces standardized data splits that systematically increase distribution shift, from local perturbations such as point mutations to broad generalization across novel antigens and antibodies. To ensure robust supervision, AbRank defines an m-confident ranking framework by filtering out comparisons with marginal affinity differences, focusing training on pairs with at least an m-fold difference in measured binding strength. As a baseline for the benchmark, we introduce WALLE-Affinity, a graph-based approach that integrates protein language model embeddings with structural information to predict pairwise binding preferences. Our benchmarks reveal significant limitations in current methods under realistic generalization settings and demonstrate that ranking-based training improves robustness and transferability. In summary, AbRank offers a robust foundation for machine learning models to generalize across the antibody-antigen space, with direct relevance for scalable, structure-aware antibody therapeutic design.
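
The m-confident filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `m_confident_pairs`, the complex identifiers, and the choice of m = 10 are hypothetical, and the sketch assumes affinity is reported as a dissociation constant (K_D, in molar units) where a smaller value means stronger binding, which the text above does not specify.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def m_confident_pairs(kd: Dict[str, float], m: float = 10.0) -> List[Tuple[str, str]]:
    """Return (stronger, weaker) pairs whose K_D values differ by at least m-fold.

    Assumption: `kd` maps a complex identifier to a positive K_D value,
    and a smaller K_D means stronger binding.
    """
    if m < 1.0:
        raise ValueError("m must be >= 1")
    pairs = []
    # Visit every unordered pair once, in a deterministic order.
    for a, b in combinations(sorted(kd), 2):
        stronger, weaker = (a, b) if kd[a] <= kd[b] else (b, a)
        # Keep the comparison only if the fold difference reaches the threshold.
        if kd[weaker] / kd[stronger] >= m:
            pairs.append((stronger, weaker))
    return pairs


# Hypothetical measurements: A and B differ by only 5-fold, so that pair is dropped.
example = {"cplx_A": 1e-9, "cplx_B": 5e-9, "cplx_C": 2e-7}
pairs = m_confident_pairs(example, m=10.0)
print(pairs)
```

Raising m discards more comparisons and leaves only pairs whose ordering is less likely to be an artifact of assay noise, at the cost of fewer training pairs.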
Problem

Research questions and friction points this paper is trying to address.

Predict antibody-antigen binding affinity accurately
Overcome noisy labels and poor generalization
Standardize evaluation for diverse antibodies and antigens
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pairwise ranking framework for affinity prediction
Standardized data splits for systematic evaluation
Graph-based model integrating embeddings and structure
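
To make the pairwise ranking idea concrete, here is a minimal sketch of one common ranking objective, a logistic loss on score differences. This is an illustrative choice, not necessarily the loss used by WALLE-Affinity, which this page does not state; the function name and the example scores are hypothetical.

```python
import math
from typing import Sequence


def pairwise_logistic_loss(score_stronger: Sequence[float],
                           score_weaker: Sequence[float]) -> float:
    """Mean of log(1 + exp(-(s_stronger - s_weaker))) over all pairs.

    Each position i holds the model scores for one pair in which the first
    complex is known to bind more strongly than the second.
    """
    if len(score_stronger) != len(score_weaker) or len(score_stronger) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for s_hi, s_lo in zip(score_stronger, score_weaker):
        diff = s_hi - s_lo
        # Numerically stable softplus(-diff) = max(-diff, 0) + log1p(exp(-|diff|)).
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(score_stronger)


# Hypothetical scores: a correctly ordered pair is penalized less than a reversed one.
correct = pairwise_logistic_loss([2.0], [0.0])
reversed_order = pairwise_logistic_loss([0.0], [2.0])
print(correct, reversed_order)
```

Because the loss depends only on score differences within a pair, it does not require affinities from different assays to share an absolute scale, which is one motivation for ranking over regression when labels are heterogeneous.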
Chunan Liu
Structural Molecular Biology, Division of Biosciences, University College London, United Kingdom
Aurelien Pelissier
IBM Research Zurich, ETH Zurich
AI, Computational Biology, Physics
Yanjun Shao
Biomedical Informatics and Data Science, Yale School of Medicine, United States
Lilian Denzler
Structural Molecular Biology, Division of Biosciences, University College London, United Kingdom; Biomedical Informatics and Data Science, Yale School of Medicine, United States
Andrew C. R. Martin
Structural Molecular Biology, Division of Biosciences, University College London, United Kingdom
Brooks Paige
Associate Professor, University College London
Machine Learning, Statistics
Maria Rodriguez Martinez
Biomedical Informatics and Data Science, Yale School of Medicine, United States