🤖 AI Summary
This study systematically evaluates the robustness of genomic foundation models (GFMs) under adversarial attacks and introduces GenoArmory, the first unified adversarial benchmark for genomics. Methodologically, it integrates four attack algorithms (FGSM, PGD, CW, and DeepFool) and three defense strategies (input perturbation, feature purification, and robust fine-tuning), and runs reproducible evaluations on five state-of-the-art GFMs across model architectures, quantization schemes, and training datasets. It also releases GenoAdv, a dedicated adversarial sample dataset. Key findings include: (1) classification-based GFMs are more robust to adversarial perturbations than generative GFMs; (2) adversarial perturbations concentrate in functional genomic regions, particularly promoters and enhancers, suggesting that GFMs capture biologically critical sequence patterns; and (3) task type and genomic region are the primary determinants of GFM robustness.
📝 Abstract
We propose GenoArmory, the first unified adversarial attack benchmark for Genomic Foundation Models (GFMs). Unlike existing GFM benchmarks, GenoArmory provides a comprehensive evaluation framework for systematically assessing the vulnerability of GFMs to adversarial attacks. Methodologically, we evaluate the adversarial robustness of five state-of-the-art GFMs using four widely adopted attack algorithms and three defense strategies. Importantly, our benchmark provides an accessible framework for analyzing GFM vulnerabilities with respect to model architecture, quantization schemes, and training datasets. Additionally, we introduce GenoAdv, a new adversarial sample dataset designed to improve GFM safety. Empirically, classification models exhibit greater robustness to adversarial perturbations than generative models, highlighting the impact of task type on model vulnerability. Moreover, adversarial attacks frequently target biologically significant genomic regions, suggesting that these models effectively capture meaningful sequence features.
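To make the idea of an adversarial perturbation on a DNA sequence concrete, here is a minimal Python sketch of a greedy nucleotide-substitution attack. It is an illustration only and is not taken from GenoArmory: `toy_promoter_score` is a hypothetical motif counter standing in for a real GFM classifier, and `greedy_substitution_attack` and its `budget` parameter are names assumed for this example.

```python
from typing import Callable, List, Tuple

NUCLEOTIDES = "ACGT"


def toy_promoter_score(seq: str) -> float:
    """Hypothetical stand-in for a GFM classifier.

    Returns the fraction of sliding windows that match the 'TATA' motif.
    A real benchmark would query a trained model instead.
    """
    motif = "TATA"
    windows = max(1, len(seq) - len(motif) + 1)
    hits = sum(
        1
        for i in range(len(seq) - len(motif) + 1)
        if seq[i:i + len(motif)] == motif
    )
    return hits / windows


def greedy_substitution_attack(
    seq: str,
    score_fn: Callable[[str], float],
    budget: int,
) -> Tuple[str, List[int]]:
    """Greedily substitute up to `budget` nucleotides to lower `score_fn`.

    Each round tries every single-nucleotide substitution and keeps the one
    that lowers the score the most. Stops early if nothing lowers it.
    Returns the perturbed sequence and the list of edited positions.
    """
    current = seq
    changed: List[int] = []
    for _ in range(budget):
        base_score = score_fn(current)
        best = None  # (score, position, nucleotide)
        for pos in range(len(current)):
            if pos in changed:
                continue  # edit each position at most once
            for nt in NUCLEOTIDES:
                if nt == current[pos]:
                    continue
                candidate = current[:pos] + nt + current[pos + 1:]
                score = score_fn(candidate)
                if score < base_score and (best is None or score < best[0]):
                    best = (score, pos, nt)
        if best is None:
            break  # no substitution reduces the score any further
        _, pos, nt = best
        current = current[:pos] + nt + current[pos + 1:]
        changed.append(pos)
    return current, changed


# Usage: a GC-rich background with one TATA motif at positions 6-9.
original = "GCGCGCTATAGCGCGC"
adversarial, edited_positions = greedy_substitution_attack(
    original, toy_promoter_score, budget=2
)
print(adversarial, edited_positions)
```

In this toy setting the only edit that lowers the score falls inside the motif, which loosely mirrors the observation that attacks tend to land on biologically significant regions. The attacks evaluated in the benchmark are more sophisticated and operate on real models.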