ViroBench: Benchmarking Nucleotide Foundation Models on Viral Genomics Tasks

📅 2026-05-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the absence of standardized benchmarks for evaluating the biological understanding and biosafety risks of nucleotide foundation models (NFMs) in viral genomics. The authors propose ViroBench, which they describe as the first large-scale benchmark designed specifically for viral genomes. It evaluates biological understanding and latent biosecurity risk across 4 task types and 18 scenarios, and is used to assess 66 models. Experiments, including multi-architecture comparisons, ablation studies, cross-clade and temporal generalization tests, and functional validation of generated sequences, show that current models generalize poorly out of distribution and that statistical likelihood is decoupled from biological function in generated sequences. Taxonomic diversity in pretraining data matters more than parameter scale: a lightweight baseline trained on diverse data achieves a 67.5% performance gain over its original model. All data and code are publicly released to support reproducible research.
📝 Abstract
Nucleotide sequences constitute the fundamental genetic basis of biological systems, rendering viral genomic analysis critical for biomedical advancement. Despite progress in biological foundation models, specifically nucleotide foundation models (NFMs), the field lacks a unified standard for viral genomics to facilitate community development and enforce biosecurity constraints. To address this, we introduce ViroBench, the first comprehensive and large-scale benchmark specifically designed for NFMs in viral settings. ViroBench evaluates models across two critical dimensions: biological understanding and latent biosecurity risk, covering 18 diverse scenarios within 4 task types. Extensive evaluation of 66 NFMs across diverse architectures yields three critical conclusions. Firstly, NFMs exhibit a performance degradation in biological understanding under phylogenetic and temporal shifts, indicating weak extrapolation capabilities. Secondly, generation tasks reveal a decoupling between statistical likelihood and biological functional validity, posing latent biosecurity risks. Thirdly, controlled ablation studies reveal that taxonomic diversity in pretraining data outweighs parameter scale. Specifically, a lightweight baseline trained on diverse data achieves a 67.5% performance gain over its original model. Overall, ViroBench provides interpretable, diagnostic evaluations and a reproducible measurement framework for future research on viral nucleotide foundation models. The datasets and code are publicly available at https://github.com/QIANJINYDX/ViroBench.
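The phylogenetic and temporal shifts mentioned in the abstract can be illustrated with a minimal sketch of how out-of-distribution test sets are commonly built. This is not ViroBench's actual splitting code: the `Record` fields, the function names, and the choice of a clade label and a collection year as split keys are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata for one viral sequence (illustrative only)."""
    seq_id: str
    clade: str   # assumed taxonomic/phylogenetic group label
    year: int    # assumed collection year


def clade_holdout_split(
    records: Iterable[Record], held_out_clades: Iterable[str]
) -> Tuple[List[Record], List[Record]]:
    """Phylogenetic shift: every record from a held-out clade goes to test."""
    held = set(held_out_clades)
    train, test = [], []
    for r in records:
        (test if r.clade in held else train).append(r)
    return train, test


def temporal_split(
    records: Iterable[Record], cutoff_year: int
) -> Tuple[List[Record], List[Record]]:
    """Temporal shift: train on records up to the cutoff, test on later ones."""
    train, test = [], []
    for r in records:
        (test if r.year > cutoff_year else train).append(r)
    return train, test
```

Under this kind of split, a model never sees the test clades (or the later years) during training, so a drop in accuracy relative to a random split is a rough indicator of weak extrapolation.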
Problem

Research questions and friction points this paper is trying to address.

nucleotide foundation models
viral genomics
benchmarking
biosecurity
biological understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

nucleotide foundation models
viral genomics
biosecurity risk
benchmarking
phylogenetic generalization
Dongxin Ye
Shanghai Innovation Institute, Shanghai, China; University of Electronic Science and Technology of China, Chengdu, China
Fang Hu
Professor, College of Information Engineering, Hubei University of Chinese Medicine
Complex Networks, Machine Learning, Data Analysis
Han Hu
Shanghai Artificial Intelligence Laboratory, Shanghai, China; Fudan University, Shanghai, China
Shu Hu
Institute of Infection and Health, Fudan University, Shanghai, China; Shanghai Sci-Tech Inno Center for Infection & Immunity, Shanghai, China
Yang Tan
Shanghai Jiao Tong University & Shanghai Innovation Institute
Bioinformatics, Deep Learning
Wanli Ouyang
Shenzhen Loop Area Institute, Shenzhen, China; Chinese University of Hong Kong, Hong Kong, China
Stan Z. Li
Westlake University, Hangzhou, China
Jie Cui
Institute of Infection and Health, Fudan University, Shanghai, China; Shanghai Sci-Tech Inno Center for Infection & Immunity, Shanghai, China
Nanqing Dong
Shanghai Artificial Intelligence Laboratory; University of Oxford
Machine Learning, Computer Vision, Optimization, AI for Science