🤖 AI Summary
Antibody affinity optimization is time-consuming and costly, and existing protein language models have limited predictive capability for this task. To address these challenges, the authors propose SimBinder-IF, a structure-aware inverse folding method for generating antibody sequences. Its core idea is to apply preference optimization to antibody inverse folding: the ESM-IF structure encoder is frozen and only the decoder is fine-tuned, with experimental affinity measurements as the supervision signal. As a result, only about 18% of the full ESM-IF parameters are trained. On the 11-assay AbBiBench benchmark, SimBinder-IF raises the mean Spearman correlation between log-likelihood scores and measured affinity from 0.264 to 0.410 (+55%). In zero-shot evaluation on four unseen antigen-antibody complexes, the correlation rises from 0.115 to 0.294 (+156%). It also achieves higher top-10 precision for ten-fold or greater affinity improvements. In a case study redesigning antibody F045-092, it proposes variants with a lower mean predicted ΔΔG than ESM-IF (−75.16 vs. −46.57).
📝 Abstract
Motivation: The clinical efficacy of antibody therapeutics critically depends on high-affinity target engagement, yet laboratory affinity-maturation campaigns are slow and costly. In computational settings, most protein language models (PLMs) are not trained to favor high-affinity antibodies, and existing preference optimization approaches introduce substantial computational overhead without clear affinity gains. Therefore, this work proposes SimBinder-IF, which converts the inverse folding model ESM-IF into an antibody sequence generator by freezing its structure encoder and training only its decoder to prefer experimentally stronger binders through preference optimization.
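The abstract does not state which preference objective is used, so the following is only a minimal sketch of the general idea, assuming a generic pairwise logistic (Bradley-Terry style) loss: given two variants of the same antibody, the decoder is penalized when it assigns the weaker binder a higher log-likelihood than the stronger one. The function names, the `beta` scaling factor, and the length-normalization option are illustrative assumptions, not the authors' implementation.

```python
import math


def sequence_log_likelihood(token_log_probs, length_normalize=True):
    """Score a sequence from its per-token log-probabilities.

    Length normalization is an assumption here; it keeps longer
    sequences from being penalized purely for having more tokens.
    """
    total = sum(token_log_probs)
    return total / len(token_log_probs) if length_normalize else total


def pairwise_preference_loss(logp_strong, logp_weak, beta=1.0):
    """Generic pairwise logistic loss: -log(sigmoid(beta * margin)).

    margin = logp_strong - logp_weak, so the loss shrinks as the model
    scores the experimentally stronger binder above the weaker one.
    `beta` is a hypothetical scaling hyperparameter.
    """
    margin = beta * (logp_strong - logp_weak)
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical per-token log-probabilities from the decoder for two variants.
strong_binder = sequence_log_likelihood([-0.4, -0.9, -0.3])
weak_binder = sequence_log_likelihood([-1.2, -1.5, -0.8])
loss = pairwise_preference_loss(strong_binder, weak_binder)
```

In a real training loop, the gradient of this loss would flow only into the decoder parameters, since the structure encoder is frozen.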
Results: On the 11-assay AbBiBench benchmark, SimBinder-IF achieves a 55% relative improvement in mean Spearman correlation between log-likelihood scores and experimentally measured binding affinity compared to vanilla ESM-IF (from 0.264 to 0.410). In zero-shot generalization across four unseen antigen-antibody complexes, the correlation improves by 156% (from 0.115 to 0.294). SimBinder-IF also outperforms baselines in top-10 precision for ten-fold or greater affinity improvements. A case study redesigning antibody F045-092 for A/California/04/2009 (pdmH1N1) shows that SimBinder-IF proposes variants with substantially lower predicted binding free energy changes than ESM-IF (mean ΔΔG −75.16 vs. −46.57). Notably, SimBinder-IF trains only about 18% of the parameters of the full ESM-IF model, highlighting its parameter efficiency for high-affinity antibody generation.
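As a worked check on the headline numbers, the sketch below shows how a Spearman correlation between model scores and measured affinities can be computed, and how the reported relative improvements follow from the quoted correlations. The helper names and the toy score/affinity lists are illustrative assumptions; only the values 0.264, 0.410, 0.115, and 0.294 come from the abstract.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def relative_improvement(baseline, new):
    """Fractional change of `new` relative to `baseline`."""
    return (new - baseline) / baseline


# Toy data (hypothetical): model log-likelihood scores vs. measured affinities.
scores = [-1.2, -0.7, -2.1, -0.9]
affinities = [6.5, 8.1, 5.9, 7.0]
rho = spearman(scores, affinities)

# Relative improvements implied by the correlations quoted in the abstract.
benchmark_gain = relative_improvement(0.264, 0.410)  # about 0.55, i.e. 55%
zero_shot_gain = relative_improvement(0.115, 0.294)  # about 1.56, i.e. 156%
```

Both reported percentages are relative gains in the correlation coefficient, not absolute differences: the absolute increases are 0.146 and 0.179, respectively.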