Structure-Aware Antibody Design with Affinity-Optimized Inverse Folding

📅 2025-12-19
🤖 AI Summary
Antibody affinity optimization is time-consuming and costly, and existing protein language models are not trained to favor high-affinity binders. SimBinder-IF is a structure-aware inverse folding method for antibody sequence generation that applies preference optimization to ESM-IF: the structure encoder is frozen and only the decoder is fine-tuned, using experimental affinity measurements as the preference signal. As a result, only about 18% of the full ESM-IF parameters are trained. On the 11-assay AbBiBench benchmark, the mean Spearman correlation between log-likelihood scores and measured binding affinity rises from 0.264 to 0.410 (+55% relative). In zero-shot evaluation on four unseen antigen-antibody complexes it rises from 0.115 to 0.294 (+156%). SimBinder-IF also outperforms baselines in top-10 precision for ten-fold or greater affinity improvements, and in a case study redesigning antibody F045-092 its proposed variants have a lower mean predicted ΔΔG than those from ESM-IF (−75.16 vs −46.57).

📝 Abstract
Motivation: The clinical efficacy of antibody therapeutics critically depends on high-affinity target engagement, yet laboratory affinity-maturation campaigns are slow and costly. In computational settings, most protein language models (PLMs) are not trained to favor high-affinity antibodies, and existing preference optimization approaches introduce substantial computational overhead without clear affinity gains. This work therefore proposes SimBinder-IF, which converts the inverse folding model ESM-IF into an antibody sequence generator by freezing its structure encoder and training only its decoder to prefer experimentally stronger binders through preference optimization.
Results: On the 11-assay AbBiBench benchmark, SimBinder-IF achieves a 55 percent relative improvement in mean Spearman correlation between log-likelihood scores and experimentally measured binding affinity compared to vanilla ESM-IF (from 0.264 to 0.410). In zero-shot generalization across four unseen antigen-antibody complexes, the correlation improves by 156 percent (from 0.115 to 0.294). SimBinder-IF also outperforms baselines in top-10 precision for ten-fold or greater affinity improvements. A case study redesigning antibody F045-092 for A/California/04/2009 (pdmH1N1) shows that SimBinder-IF proposes variants with substantially lower predicted binding free energy changes than ESM-IF (mean ΔΔG −75.16 vs −46.57). Notably, SimBinder-IF trains only about 18 percent of the parameters of the full ESM-IF model, highlighting its parameter efficiency for high-affinity antibody generation.
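The evaluation quantities named in the abstract can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the benchmark's code: the function names (`spearman`, `relative_improvement`, `precision_at_k`) and the input layout are assumptions, and the abstract does not spell out how top-10 precision is computed, so the definition used here (the fraction of the k highest-scored variants whose measured fold improvement meets a threshold) is one plausible reading.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def relative_improvement(baseline, improved):
    """Relative change of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


def precision_at_k(scores, fold_improvement, k=10, threshold=10.0):
    """Fraction of the k highest-scored variants with fold improvement >= threshold.

    Assumed definition; the source only names the metric.
    """
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(fold_improvement[i] >= threshold for i in top) / len(top)


# Reproduce the relative improvements quoted in the abstract.
print(round(100 * relative_improvement(0.264, 0.410)))  # 55
print(round(100 * relative_improvement(0.115, 0.294)))  # 156
```

In this setup, `x` would hold the model's log-likelihood score for each variant in an assay and `y` the measured binding affinity, with the correlation averaged over assays to get the benchmark-level figure.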
Problem

Research questions and friction points this paper is trying to address.

Designing high-affinity antibodies computationally without expensive lab experiments
Optimizing antibody sequences for stronger binding using structure-aware models
Improving computational efficiency in antibody design while maintaining affinity gains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Freezes structure encoder, trains decoder for affinity optimization
Uses preference optimization to favor stronger binding antibodies
Achieves high parameter efficiency with partial model training
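To make the idea of training a decoder "to prefer experimentally stronger binders" concrete, here is a minimal sketch of a pairwise preference objective. It is not the paper's loss: the summary does not state which preference objective SimBinder-IF uses, so a generic Bradley-Terry style logistic loss on log-likelihood differences stands in for it. The names `preference_pairs`, `pairwise_preference_loss`, `beta`, and `min_gap` are hypothetical, and the per-sequence log-likelihoods that a real decoder would produce (with the encoder frozen) are supplied here as plain numbers.

```python
from math import exp, log


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -log(1 + exp(-z))
    return z - log(1 + exp(z))


def preference_pairs(variants, min_gap=0.0):
    """Build (winner, loser) name pairs from (name, affinity) tuples.

    Assumes a higher affinity value means a stronger binder. Pairs whose
    affinity gap is not larger than `min_gap` are skipped as uninformative.
    """
    pairs = []
    for i, (name_a, aff_a) in enumerate(variants):
        for name_b, aff_b in variants[i + 1:]:
            if abs(aff_a - aff_b) > min_gap:
                pairs.append((name_a, name_b) if aff_a > aff_b else (name_b, name_a))
    return pairs


def pairwise_preference_loss(loglik, pairs, beta=1.0):
    """Mean of -log sigmoid(beta * (loglik[winner] - loglik[loser])).

    The loss shrinks as the model assigns higher log-likelihood to the
    experimentally stronger binder in each pair.
    """
    if not pairs:
        raise ValueError("at least one preference pair is required")
    losses = [
        -log_sigmoid(beta * (loglik[winner] - loglik[loser]))
        for winner, loser in pairs
    ]
    return sum(losses) / len(losses)


# Toy data: three hypothetical variants with measured affinities.
variants = [("a", 3.0), ("b", 1.0), ("c", 2.0)]
pairs = preference_pairs(variants)

aligned = {"a": -1.0, "c": -2.0, "b": -3.0}   # model ranks variants like the assay
reversed_ = {"a": -3.0, "c": -2.0, "b": -1.0}  # model ranks them the opposite way
print(pairwise_preference_loss(aligned, pairs) < pairwise_preference_loss(reversed_, pairs))
```

In a real fine-tuning run, gradients of such a loss would flow only into the decoder parameters, since the structure encoder is frozen; that part depends on the deep learning framework and is not shown here.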
Xinyan Zhao
McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 77030, Texas, United States
Yi-Ching Tang
McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 77030, Texas, United States
Rivaaj Monsia
Department of Computer Science, The University of Texas at Austin, 78712, Texas, United States
Victor J. Cantu
McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 77030, Texas, United States
Ashwin Kumar Ramesh
Texas Therapeutics Institute, Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center at Houston, 77030, Texas, United States
Xiaozhong Liu
School of Informatics and Computing, Indiana University Bloomington
Information Retrieval, Natural Language Processing, Digital Library, Semantic Web, Metadata
Zhiqiang An
Professor of Molecular Medicine, University of Texas Health Science Center at Houston
antibody therapeutics, antibiotics, cancer biology, microbial natural products
Xiaoqian Jiang
McWilliams School of Biomedical Informatics, UTHealth
predictive modeling, healthcare privacy
Yejin Kim
McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 77030, Texas, United States