Decoupled Sequence and Structure Generation for Realistic Antibody Design

📅 2024-02-08
📈 Citations: 0
Influential: 0
📄 PDF

career value

187K/year
🤖 AI Summary
Existing antibody sequence–structure co-generation models suffer from inaccurate 3D coordinate modeling, distorted amino acid type distributions, and high sequence repetition rates—posing potential immunogenicity risks. To address these limitations, we propose the Antibody Sequence–Structure Decoupling (ASSD) framework, the first to decouple sequence generation and structure prediction into two independent tasks, thereby overcoming architectural and performance bottlenecks inherent in joint modeling. Methodologically, we design an amino acid composition-constrained objective that significantly suppresses repetitive token generation in non-autoregressive sequence modeling, and introduce a synergistic optimization mechanism balancing structural fidelity and developability. Experiments demonstrate that ASSD achieves state-of-the-art performance across multiple antibody design benchmarks: repetitive token rate decreases by 42%, generated sequences better approximate natural antibody distributions, and backbone RMSD improves by 18%. These advances collectively enhance clinical safety and drug developability.

Technology Category

Application Category

📝 Abstract
Recently, deep learning has made rapid progress in antibody design, which plays a key role in the advancement of therapeutics. A dominant paradigm is to train a model to jointly generate the antibody sequence and the structure as a candidate. However, the joint generation requires the model to generate both the discrete amino acid categories and the continuous 3D coordinates; this limits the space of possible architectures and may lead to suboptimal performance. In response, we propose an antibody sequence-structure decoupling (ASSD) framework, which separates sequence generation and structure prediction. Although our approach is simple, our idea allows the use of powerful neural architectures and demonstrates notable performance improvements. We also find that the widely used non-autoregressive generators promote sequences with overly repeating tokens. Such sequences are both out-of-distribution and prone to undesirable developability properties that can trigger harmful immune responses in patients. To resolve this, we introduce a composition-based objective that allows an efficient trade-off between high performance and low token repetition. ASSD shows improved performance in various antibody design experiments, while the composition-based objective successfully mitigates token repetition of non-autoregressive models.
Problem

Research questions and friction points this paper is trying to address.

Antibody sequence generation
3D structure prediction
Deep learning limitations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Antibody Sequence-Structure Decoupling (ASSD)
Neural Network Technology
Sequence Redundancy Reduction
🔎 Similar Papers
No similar papers found.