Pan-infection Foundation Framework Enables Multiple Pathogen Prediction

📅 2024-12-31

🤖 AI Summary
Current infection diagnostic models are built on small cohorts with coarse infection categories, which limits their ability to distinguish bacterial from viral infections or to identify specific pathogens. To address this, we curate the largest pan-infection host-response transcriptomic dataset, comprising 11,247 samples across 89 blood transcriptome datasets, and propose a "pan-infection foundation model + knowledge distillation" paradigm. We first train a pan-infection foundation diagnostic model (AUC = 0.97). We then use knowledge distillation to derive lightweight student models for staphylococcal infection (AUC = 0.99), streptococcal infection (AUC = 0.94), HIV infection (AUC = 0.93), RSV infection (AUC = 0.94), and sepsis (AUC = 0.99). The framework transfers what the foundation model has learned across pathogens and across conditions (from pan-infection to sepsis), and the resulting lightweight models are intended for deployment in clinical settings.

📝 Abstract
Host-response-based diagnostics can improve the accuracy of diagnosing bacterial and viral infections, thereby reducing inappropriate antibiotic prescriptions. However, existing cohorts have limited sample sizes and coarse infection types, so they cannot support the development of an accurate and generalizable diagnostic model. Here, we curate the largest infection host-response transcriptome dataset, comprising 11,247 samples across 89 blood transcriptome datasets from 13 countries and 21 platforms. Using this pan-infection dataset, we first build a pan-infection model (AUC = 0.97) that serves as the foundation for pathogen prediction. We then use knowledge distillation to transfer the insights of this "teacher" model to four lightweight pathogen "student" models, i.e., staphylococcal infection (AUC = 0.99), streptococcal infection (AUC = 0.94), HIV infection (AUC = 0.93), and RSV infection (AUC = 0.94), as well as a sepsis "student" model (AUC = 0.99). The proposed knowledge distillation framework not only facilitates the diagnosis of pathogens using pan-infection data, but also enables a cross-disease study from pan-infection to sepsis. Moreover, the framework yields highly lightweight diagnostic models that are expected to be adaptable for deployment in clinical settings.
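The abstract does not state the loss used to train the student models, so the following is only a minimal sketch of a common teacher-student distillation objective for a binary diagnosis task, not the authors' implementation. It assumes that the teacher and student each emit a single logit per sample, and it blends a temperature-softened teacher target with the ground-truth label. The function names and the `temperature` and `alpha` parameters are illustrative assumptions.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(target, prob, eps=1e-12):
    # Cross-entropy between a (possibly soft) target in [0, 1]
    # and a predicted probability, clipped to avoid log(0).
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(student_logit, teacher_logit, hard_label,
                      temperature=2.0, alpha=0.5):
    # Hypothetical hyperparameters: temperature softens both logits,
    # alpha weights the teacher-matching term against the label term.
    soft_target = sigmoid(teacher_logit / temperature)
    soft_pred = sigmoid(student_logit / temperature)
    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term, as is conventional in distillation.
    soft_term = binary_cross_entropy(soft_target, soft_pred) * temperature ** 2
    hard_term = binary_cross_entropy(hard_label, sigmoid(student_logit))
    return alpha * soft_term + (1.0 - alpha) * hard_term
```

In this sketch, a student that agrees with a confident teacher incurs a smaller loss than one that contradicts it. Setting `alpha` to 0 reduces the objective to ordinary supervised training on the pathogen labels alone.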
Problem

Research questions and friction points this paper is trying to address.

Infectious Disease Diagnosis
Bacterial vs Viral Infections
Specific Pathogen Identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Distillation
Pathogen Prediction Model
Clinical Application Enhancement
Authors
Lingrui Zhang
Shenzhen Key Laboratory of Robotics Perception and Intelligence, and the Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
Haonan Wu
Department of Critical Care Medicine, Shenzhen People’s Hospital, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen 518020, China
Nana Jin
Department of Critical Care Medicine, Shenzhen People’s Hospital, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen 518020, China
Chenqing Zheng
Department of Critical Care Medicine, Shenzhen People’s Hospital, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen 518020, China
Jize Xie
The Hong Kong University of Science and Technology
Operations research, Online learning
Qitai Cai
Department of Critical Care Medicine, Shenzhen People’s Hospital, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen 518020, China
Jun Wang
Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
Qin Cao
School of Biomedical Science, The Chinese University of Hong Kong, Hong Kong
Xubin Zheng
Great Bay University
Bioinformatics and Computational Biology
Jiankun Wang
Southern University of Science and Technology
Robotics, Path Planning, Motion Control, Human-Robot Interaction
Lixin Cheng
Shenzhen People's Hospital
Computational Biology and Bioinformatics