🤖 AI Summary
Current immunogenicity prediction methods suffer from low accuracy and poor generalizability due to excessive feature compression and overly simplistic model architectures. To address these limitations, we propose VenusVaccine, an end-to-end deep learning framework. First, we introduce a novel dual-path attention mechanism that jointly integrates ESM-derived sequence embeddings with SE(3)-equivariant CNN structural representations. Second, we construct the largest cross-pathogen (bacterial/viral/tumor) immunogenicity benchmark dataset to date, comprising over 7,000 experimentally validated samples. Third, we design a post-hoc interpretability verification protocol to enhance biological plausibility and model transparency. VenusVaccine achieves statistically significant improvements over state-of-the-art methods across multiple evaluation metrics. Crucially, wet-lab validation confirms its capability to identify bona fide vaccine targets in vivo. Both code and data are publicly released to advance standardization in reverse vaccinology.
📝 Abstract
Immunogenicity prediction is a central topic in reverse vaccinology for finding candidate vaccines that can trigger protective immune responses. Existing approaches typically rely on highly compressed features and simple model architectures, leading to limited prediction accuracy and poor generalizability. To address these challenges, we introduce VenusVaccine, a novel deep learning solution with a dual attention mechanism that integrates pre-trained latent vector representations of protein sequences and structures. We also compile the most comprehensive immunogenicity dataset to date, encompassing over 7000 antigen sequences, structures, and immunogenicity labels from bacteria, viruses, and tumors. Extensive experiments demonstrate that VenusVaccine outperforms existing methods across a wide range of evaluation metrics. Furthermore, we establish a post-hoc validation protocol to assess the practical significance of deep learning models in tackling vaccine design challenges. Our work provides an effective tool for vaccine design and sets valuable benchmarks for future research. The implementation is available at https://github.com/songleee/VenusVaccine.
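The abstract does not spell out how the dual attention mechanism combines the two modalities, so the following is only a minimal sketch of one common reading: sequence embeddings attend over structure embeddings, structure embeddings attend over sequence embeddings, and the two pooled results are concatenated. The function names, the toy embeddings, and the absence of learned projection matrices are all assumptions made for illustration; this is not the VenusVaccine implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention without learned projections.

    Each query row attends over all rows of keys_values, which serve as
    both keys and values (an illustrative simplification).
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys_values]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(dim)]
        )
    return attended


def mean_pool(rows):
    """Average a list of equal-length vectors into a single vector."""
    count = len(rows)
    return [sum(r[j] for r in rows) / count for j in range(len(rows[0]))]


def dual_attention_fuse(seq_emb, struct_emb):
    """Hypothetical fusion: attend in both directions, pool, concatenate."""
    seq_context = cross_attend(seq_emb, struct_emb)      # sequence -> structure
    struct_context = cross_attend(struct_emb, seq_emb)   # structure -> sequence
    return mean_pool(seq_context) + mean_pool(struct_context)


# Toy per-residue embeddings (3 residues, 4 dimensions); values are made up.
seq_emb = [[0.1, 0.3, -0.2, 0.5], [0.4, -0.1, 0.2, 0.0], [0.0, 0.2, 0.1, -0.3]]
struct_emb = [[0.2, 0.1, 0.0, -0.1], [-0.3, 0.4, 0.2, 0.1], [0.1, 0.0, -0.2, 0.3]]

fused = dual_attention_fuse(seq_emb, struct_emb)
print(len(fused))
```

In a trained model the fused vector would typically feed a classification head that outputs an immunogenicity score, and the queries, keys, and values would pass through learned linear projections first; both are omitted here to keep the sketch dependency-free.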