On fine-tuning Boltz-2 for protein-protein affinity prediction

📅 2025-12-06
🤖 AI Summary
This study addresses the underperformance of structure-based models relative to sequence-based methods in protein-protein interaction (PPI) binding affinity regression. The authors adapt the structural model Boltz-2 for PPI affinity prediction, yielding Boltz-2-PPI, and combine its structure-derived embeddings with sequence embeddings (e.g., from ESM-2) through feature fusion. Experiments on the TCR3d and PPB-affinity benchmarks show that the purely structural model remains inferior to sequence-based baselines despite high structural accuracy, while fusing the two modalities improves performance (ΔRMSE ≤ 0.8 kcal/mol), particularly for weaker sequence models. These results suggest that structure- and sequence-based models learn different, complementary signals, and that current structure-based representations are not yet well suited to affinity modeling.

📝 Abstract
Accurate prediction of protein-protein binding affinity is vital for understanding molecular interactions and designing therapeutics. We adapt Boltz-2, a state-of-the-art structure-based protein-ligand affinity predictor, for protein-protein affinity regression and evaluate it on two datasets, TCR3d and PPB-affinity. Despite high structural accuracy, Boltz-2-PPI underperforms relative to sequence-based alternatives in both small- and larger-scale data regimes. Combining embeddings from Boltz-2-PPI with sequence-based embeddings yields complementary improvements, particularly for weaker sequence models, suggesting different signals are learned by sequence- and structure-based models. Our results echo known biases associated with training with structural data and suggest that current structure-based representations are not primed for performant affinity prediction.

Problem

Research questions and friction points this paper is trying to address.

Adapting Boltz-2 for protein-protein affinity prediction
Evaluating structure-based versus sequence-based affinity models
Addressing biases in structure-based protein affinity representations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned Boltz-2 for protein-protein affinity regression
Combined structure-based and sequence-based embeddings for complementary improvements
Evaluated model on TCR3d and PPB-affinity datasets
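
The idea of combining structure-based and sequence-based embeddings can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes late fusion by simple concatenation followed by a ridge regressor, and it uses synthetic arrays in place of real Boltz-2-PPI and sequence-model embeddings. All names here (`fuse`, `fit_ridge`, `struct_emb`, `seq_emb`) are hypothetical, and the synthetic target is deliberately built to depend on both modalities, so the printed numbers say nothing about the paper's results.

```python
import numpy as np


def fuse(struct_emb, seq_emb):
    """Late fusion (assumed here): concatenate per-complex embeddings feature-wise."""
    return np.concatenate([struct_emb, seq_emb], axis=1)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-ins for pooled embeddings; dimensions are arbitrary assumptions.
rng = np.random.default_rng(0)
n, d_struct, d_seq = 200, 16, 32
struct_emb = rng.normal(size=(n, d_struct))
seq_emb = rng.normal(size=(n, d_seq))

# Toy affinity target that depends on one feature from each modality plus noise.
affinity = struct_emb[:, 0] + seq_emb[:, 0] + 0.1 * rng.normal(size=n)

train, test = slice(0, 150), slice(150, n)
features = {
    "structure": struct_emb,
    "sequence": seq_emb,
    "fused": fuse(struct_emb, seq_emb),
}

results = {}
for name, X in features.items():
    w, b = fit_ridge(X[train], affinity[train])
    results[name] = rmse(affinity[test], X[test] @ w + b)

for name, value in results.items():
    print(f"{name}: test RMSE = {value:.3f}")
```

In practice the random arrays would be replaced by embeddings extracted from the two models for the same set of complexes, and the split would follow the dataset's own evaluation protocol; concatenation is only one of several possible fusion strategies.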

Authors

James King (Google; High Performance Computing, GPGPU, Distributed Computing)
Lewis Cornwall (Synteny, London, UK)
Andrei Cristian Nica (Synteny, London, UK)
James Day (Synteny, London, UK)
Aaron Sim (Synteny, London, UK)
Neil Dalchau (Synteny, London, UK)
Lilly Wollman (Synteny, London, UK)
Joshua Meyers (Synteny, London, UK)