🤖 AI Summary
Rational antibody design must jointly optimize sequence and structure while balancing binding affinity against stability. Method: We propose a three-stage deep learning framework: (1) pretraining a specificity-aware language model on large-scale antibody sequences; (2) constructing a representation-guided joint sequence–structure diffusion model; and (3) introducing a multi-objective energy function, incorporating electrostatic, van der Waals, and hydrogen-bond terms, to drive Pareto-optimal alignment, establishing the first Pareto optimization paradigm for antibody energy alignment. Technically, we extend AbDPO to multi-objective direct preference optimization and design a temperature-scaled iterative online learning mechanism that integrates heterogeneous online data without requiring additional samples. Results: Generated antibodies exhibit native-like structural features and high-affinity antigen binding. Our method significantly improves Pareto front quality across multiple benchmarks, enhancing design stability and functional predictability.
📝 Abstract
We present a three-stage framework for training deep learning models specialized for antibody sequence-structure co-design. We first pre-train a language model on millions of antibody sequences. Then, we use the learned representations to guide the training of a diffusion model that jointly optimizes the sequence and structure of antibodies. In the final alignment stage, we optimize the model to favor antibodies with low repulsion and high attraction to the antigen binding site, enhancing the rationality and functionality of the designs. To mitigate conflicting energy preferences, we extend AbDPO (Antibody Direct Preference Optimization) to guide the model towards Pareto optimality under multiple energy-based alignment objectives. Furthermore, we adopt an iterative learning paradigm with temperature scaling, enabling the model to benefit from diverse online datasets without requiring additional data. In practice, our proposed methods are stable and efficient, and they produce a better Pareto front of antibody designs than the top samples generated by baselines and previous alignment techniques. Through extensive experiments, we show that our methods consistently generate native-like antibodies with high binding affinity.
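To make the notion of a Pareto front concrete, here is a minimal sketch of non-dominated filtering over candidate designs. It is an illustration only, not the alignment procedure described above. The function names, the two-objective setup, and the energy values are all hypothetical, and both objectives are assumed to be energies where lower is better (less repulsion, more negative attraction energy).

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` on every objective
    and strictly better on at least one (lower is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (repulsion energy, attraction energy) pairs for four designs.
# These numbers are made up for illustration; lower is better for both.
designs = [(1.0, -5.0), (2.0, -6.0), (2.5, -4.0), (0.5, -3.0)]
front = pareto_front(designs)
print(front)  # [0, 1, 3]
```

The third design is worse than the first on both energies, so it is dropped. The remaining three each trade repulsion against attraction, so none of them can be improved on one objective without losing on the other.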