Predicting and generating antibiotics against future pathogens with ApexOracle

📅 2025-07-10
🤖 AI Summary
Antimicrobial resistance (AMR) is escalating while the development of novel antibiotics lags behind, creating a need for generalizable computational methods that can find potent antimicrobials against emerging pathogens with scarce data. ApexOracle is an AI framework that integrates pathogen-specific context: it builds strain embeddings from two sources (genomic sequences and scientific literature) and couples them with a discrete diffusion language model used for molecular representation and de novo generation. Within a unified architecture, the framework supports both activity prediction and generative design against previously unseen pathogens, including those with little or no antimicrobial activity data. In the reported evaluations, it outperformed state-of-the-art models in activity prediction across multiple bacterial species and chemical modalities. It also generates "new-to-nature" molecules in silico with high predicted efficacy against priority threats. The work offers a scalable, context-aware strategy for accelerating antibiotic discovery against AMR.

📝 Abstract
Antimicrobial resistance (AMR) is escalating and outpacing current antibiotic development. Thus, discovering antibiotics effective against emerging pathogens is becoming increasingly critical. However, existing approaches cannot rapidly identify effective molecules against novel pathogens or emerging drug-resistant strains. Here, we introduce ApexOracle, an artificial intelligence (AI) model that both predicts the antibacterial potency of existing compounds and designs de novo molecules active against strains it has never encountered. Departing from models that rely solely on molecular features, ApexOracle incorporates pathogen-specific context through the integration of molecular features captured via a foundational discrete diffusion language model and a dual-embedding framework that combines genomic- and literature-derived strain representations. Across diverse bacterial species and chemical modalities, ApexOracle consistently outperformed state-of-the-art approaches in activity prediction and demonstrated reliable transferability to novel pathogens with little or no antimicrobial data. Its unified representation-generation architecture further enables the in silico creation of "new-to-nature" molecules with high predicted efficacy against priority threats. By pairing rapid activity prediction with targeted molecular generation, ApexOracle offers a scalable strategy for countering AMR and preparing for future infectious-disease outbreaks.
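
To make the dual-embedding idea concrete, here is a minimal sketch of how a genomic-derived and a literature-derived strain vector might be fused and combined with a molecular embedding to score activity. Everything in it is an illustrative assumption and not ApexOracle's actual architecture: the function names (`fuse_strain_embeddings`, `predict_activity`), the use of normalization plus concatenation as the fusion step, and the linear scoring head are all hypothetical stand-ins for components the abstract only describes at a high level.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_strain_embeddings(genomic: Sequence[float],
                           literature: Sequence[float]) -> List[float]:
    """Hypothetical fusion: normalize each source, then concatenate.

    Normalizing first keeps one source from dominating only because its
    raw values happen to be larger.
    """
    return l2_normalize(genomic) + l2_normalize(literature)


def predict_activity(molecule: Sequence[float],
                     strain: Sequence[float],
                     weights: Sequence[float],
                     bias: float = 0.0) -> float:
    """Hypothetical linear head over the joint molecule + strain features."""
    features = list(molecule) + list(strain)
    if len(features) != len(weights):
        raise ValueError("weights must match the joint feature length")
    return sum(w * x for w, x in zip(weights, features)) + bias


# Toy vectors standing in for learned embeddings (assumed values).
strain = fuse_strain_embeddings([3.0, 4.0], [0.0, 2.0])
score = predict_activity([1.0], strain, [1.0] * 5)
```

Because the strain vector comes from genomic and literature information, not from past activity measurements, a scheme of this shape can in principle produce a score for a strain that has no antimicrobial data, which is the transfer setting the abstract emphasizes.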

Problem

Research questions and friction points this paper is trying to address.

Predicting antibiotic effectiveness against future pathogens
Designing new molecules for novel drug-resistant strains
Overcoming limitations of current antibiotic discovery methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI model predicts and designs antibiotics
Integrates genomic and literature pathogen data
Generates new molecules for novel pathogens
Tianang Leng
Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
Fangping Wan
Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
Marcelo Der Torossian Torres
University of Pennsylvania
Peptide Chemistry, Antimicrobial Peptides, Peptide Design
Cesar de la Fuente-Nunez
Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.