🤖 AI Summary
This work addresses the lack of standardized adversarial robustness evaluation for genomic foundation models (GFMs) by introducing GERM, the first unified adversarial benchmark for GFMs. Methodologically, it systematically evaluates five state-of-the-art GFMs under four adversarial attack algorithms—including PGD and FGSM—and three defense strategies, integrating quantitative robustness analysis, architectural comparison, and training data provenance to establish a reproducible, fine-grained evaluation framework. Key contributions include: (1) the first empirical demonstration that transformer-based architectures are more adversarially robust than HyenaDNA; (2) the discovery that adversarial perturbations consistently target biologically functional regions—particularly promoters and enhancers—providing evidence that GFMs have learned semantically meaningful genomic representations; and (3) the release of a vulnerability attribution toolkit enabling disentangled robustness analysis across architectural design, quantization schemes, and training data composition.
📝 Abstract
We propose the first unified adversarial attack benchmark for Genomic Foundation Models (GFMs), named GERM. Unlike existing GFM benchmarks, GERM offers a comprehensive evaluation framework to systematically assess the vulnerability of GFMs to adversarial attacks. Methodologically, we evaluate the adversarial robustness of five state-of-the-art GFMs using four widely adopted attack algorithms and three defense strategies. Importantly, our benchmark provides an accessible and comprehensive framework for analyzing GFM vulnerabilities with respect to model architecture, quantization schemes, and training datasets. Empirically, transformer-based models exhibit greater robustness to adversarial perturbations than HyenaDNA, highlighting the impact of architectural design on vulnerability. Moreover, adversarial attacks frequently target biologically significant genomic regions, suggesting that these models effectively capture meaningful sequence features.
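To make the attack side concrete, below is a minimal NumPy sketch of FGSM, one of the gradient-based attacks named above, applied to a toy logistic model. This is an illustrative assumption, not GERM's implementation: GERM attacks discrete DNA token sequences, whereas this sketch perturbs a continuous input vector; the function `fgsm_perturb` and the linear model are hypothetical. PGD can be viewed as this single step applied iteratively with a projection back into the allowed perturbation ball.

```python
import numpy as np

def fgsm_perturb(x, w, y, eps):
    """Fast Gradient Sign Method on a logistic model p = sigmoid(w . x).

    For binary cross-entropy loss, the gradient with respect to the
    input x is (p - y) * w, so the adversarial example is
    x_adv = x + eps * sign(grad), which nudges each coordinate in the
    direction that increases the loss.
    """
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))  # model's predicted probability
    grad = (p - y) * w                        # dL/dx for cross-entropy loss
    return x + eps * np.sign(grad)

# Toy usage: a 2-feature input classified by weights w, true label y = 1.
w = np.array([1.0, -1.0])
x = np.array([0.5, 0.5])
x_adv = fgsm_perturb(x, w, y=1.0, eps=0.1)   # -> array([0.4, 0.6])
```

The perturbed input lowers the logit `w . x` from 0.0 to -0.2, pushing the model's confidence away from the true label while moving each feature by at most `eps`, which is the bounded-perturbation property adversarial robustness benchmarks measure against.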