🤖 AI Summary
This study addresses immunoglobulin–antigen (Ig–Ag) binding prediction, which is hindered by the scarcity of experimentally resolved complexes and the limited accuracy of de novo antibody structure prediction. The authors propose IgPose, a framework for binding pose identification and scoring built on a generative data-augmentation pipeline that enriches training data with high-fidelity decoy conformations. Key contributions include the Structural Immunoglobulin Decoy Database (SIDD); an interface-focused k-hop sampling strategy combined with biologically guided pooling; and two sub-networks, IgPoseClassifier for binding pose discrimination and IgPoseScore for DockQ score estimation, which combine equivariant graph neural networks, ESM-2 embeddings, and gated recurrent units to capture geometric and evolutionary features. IgPose is reported to achieve robust performance relative to physics-based and deep learning baselines on curated internal test sets and the CASP-16 benchmark, making it a candidate tool for pose filtering and ranking in high-throughput antibody discovery.
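For readers unfamiliar with the DockQ score mentioned above, the following is a minimal sketch of the standard DockQ definition from the docking literature, not code from IgPose. It averages the fraction of native contacts (Fnat) with scaled interface RMSD and ligand RMSD terms. The function and parameter names are illustrative assumptions; the scaling constants 1.5 Å and 8.5 Å are the commonly used defaults.

```python
def scaled_rms(rms: float, d: float) -> float:
    """Map an RMSD in angstroms to (0, 1]; 1.0 means a perfect match."""
    return 1.0 / (1.0 + (rms / d) ** 2)


def dockq(fnat: float, irms: float, lrms: float,
          d_irms: float = 1.5, d_lrms: float = 8.5) -> float:
    """Standard DockQ: mean of Fnat and the two scaled RMSD terms.

    fnat: fraction of native interface contacts recovered, in [0, 1]
    irms: interface RMSD in angstroms
    lrms: ligand RMSD in angstroms
    """
    return (fnat + scaled_rms(irms, d_irms) + scaled_rms(lrms, d_lrms)) / 3.0


# A decoy identical to the native complex scores 1.0.
perfect = dockq(fnat=1.0, irms=0.0, lrms=0.0)

# A decoy with no native contacts and RMSDs equal to the scaling
# constants scores (0 + 0.5 + 0.5) / 3.
poor = dockq(fnat=0.0, irms=1.5, lrms=8.5)
```

A scoring network such as IgPoseScore would be trained to estimate this value for a decoy without access to the native structure; how the authors compute the training labels is not described in this summary.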
📝 Abstract
Predicting immunoglobulin-antigen (Ig-Ag) binding remains a significant challenge due to the paucity of experimentally resolved complexes and the limited accuracy of de novo Ig structure prediction. We introduce IgPose, a generalizable framework for Ig-Ag pose identification and scoring, built on a generative data-augmentation pipeline. To mitigate data scarcity, we constructed the Structural Immunoglobulin Decoy Database (SIDD), a comprehensive repository of high-fidelity synthetic decoys. IgPose integrates equivariant graph neural networks, ESM-2 embeddings, and gated recurrent units to capture both geometric and evolutionary features. We implemented interface-focused k-hop sampling with biologically guided pooling to enhance generalization across diverse interfaces. The framework comprises two sub-networks, IgPoseClassifier for binding pose discrimination and IgPoseScore for DockQ score estimation, and achieves robust performance on curated internal test sets and the CASP-16 benchmark compared to physics-based and deep learning baselines. By providing accurate pose filtering and ranking, IgPose serves as a versatile computational tool for high-throughput antibody discovery pipelines. IgPose is available on GitHub (https://github.com/arontier/igpose).
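The abstract does not specify how interface-focused k-hop sampling is implemented, so the following is only an illustrative sketch of the general idea under stated assumptions: residues are graph nodes, edges connect residues within a distance cutoff, interface residues are those with a neighbor on the partner molecule, and the sampled subgraph contains every residue within k hops of the interface. All function names, the cutoff value, and the toy coordinates are hypothetical and are not taken from the IgPose repository.

```python
from collections import deque
from math import dist


def build_contact_graph(coords, cutoff=8.0):
    """Adjacency sets linking residues whose coordinates lie within `cutoff` angstroms."""
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def interface_seeds(labels, adj):
    """Residues with at least one graph neighbor carrying a different label (Ig vs. Ag)."""
    return {i for i, nbrs in adj.items()
            if any(labels[j] != labels[i] for j in nbrs)}


def k_hop_subgraph(adj, seeds, k):
    """Multi-source BFS: map each residue within k hops of a seed to its hop distance."""
    hops = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == k:
            continue  # do not expand beyond k hops
        for nbr in adj[node]:
            if nbr not in hops:
                hops[nbr] = hops[node] + 1
                queue.append(nbr)
    return hops


# Toy complex: eight residues on a line, 5 angstroms apart (hypothetical data).
coords = [(5.0 * i, 0.0, 0.0) for i in range(8)]
labels = ["Ig"] * 4 + ["Ag"] * 4

adj = build_contact_graph(coords, cutoff=6.0)
seeds = interface_seeds(labels, adj)       # residues 3 and 4 face each other
sampled = k_hop_subgraph(adj, seeds, k=1)  # adds residues 2 and 5 at hop 1
```

Restricting the graph to a k-hop neighborhood of the interface keeps the input small and focuses the model on the residues most likely to determine binding; the pooling step described in the abstract would then aggregate features over this subgraph.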