A deep reinforcement learning platform for antibiotic discovery

📅 2025-09-16
🤖 AI Summary
Antimicrobial resistance (AMR) poses a growing global health threat and demands faster discovery of new, potent antibiotics. This work presents ApexAmphion, a deep reinforcement learning framework that couples a 6.4-billion-parameter protein language model with proximal policy optimization (PPO) for de novo design of peptide antibiotics. The model is first fine-tuned on curated peptide data, then optimized against a composite reward that combines a learned minimum inhibitory concentration (MIC) classifier with differentiable physicochemical objectives, unifying generation, scoring, and multi-objective optimization in a single pipeline. In vitro testing of 100 designed peptides found low MIC values for all candidates (nanomolar range in some cases), and 99 of 100 showed broad-spectrum activity against at least two clinically relevant bacteria. The lead molecules killed bacteria primarily by targeting the cytoplasmic membrane. The approach offers a scalable route to peptide antibiotics that can be steered iteratively toward potency and developability.

📝 Abstract
Antimicrobial resistance (AMR) is projected to cause up to 10 million deaths annually by 2050, underscoring the urgent need for new antibiotics. Here we present ApexAmphion, a deep-learning framework for de novo design of antibiotics that couples a 6.4-billion-parameter protein language model with reinforcement learning. The model is first fine-tuned on curated peptide data to capture antimicrobial sequence regularities, then optimized with proximal policy optimization against a composite reward that combines predictions from a learned minimum inhibitory concentration (MIC) classifier with differentiable physicochemical objectives. In vitro evaluation of 100 designed peptides showed low MIC values (nanomolar range in some cases) for all candidates (100% hit rate). Moreover, 99 out of 100 compounds exhibited broad-spectrum antimicrobial activity against at least two clinically relevant bacteria. The lead molecules killed bacteria primarily by potently targeting the cytoplasmic membrane. By unifying generation, scoring and multi-objective optimization with deep reinforcement learning in a single pipeline, our approach rapidly produces diverse, potent candidates, offering a scalable route to peptide antibiotics and a platform for iterative steering toward potency and developability within hours.
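The abstract names proximal policy optimization as the reinforcement learning step but does not spell out the objective. As a rough illustration, the standard clipped surrogate objective from the PPO literature can be sketched as below. This is not the authors' implementation: the function name `ppo_clipped_objective`, the per-sample log-probability inputs, and the clip range of 0.2 are all assumptions chosen for the example.

```python
import math
from typing import Sequence


def ppo_clipped_objective(
    new_logps: Sequence[float],
    old_logps: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; a commonly used default
) -> float:
    """Mean PPO-Clip surrogate over a batch of sampled sequences (to be maximized).

    new_logps / old_logps: log-probability of each sampled sequence under the
    current policy and under the policy that generated the sample.
    advantages: how much better each sample's reward was than a baseline.
    """
    if not advantages:
        raise ValueError("batch must not be empty")
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes any incentive to push the policy far from the one that produced the samples in a single update, which is one plausible reason to pair PPO with a large pretrained language model whose learned sequence regularities should be preserved.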
Problem

Research questions and friction points this paper is trying to address.

Developing new antibiotics to combat antimicrobial resistance, which is projected to cause up to 10 million deaths annually by 2050
Creating a deep learning framework for de novo design of peptide antibiotics
Optimizing antibiotic candidates for potency and broad-spectrum activity efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep reinforcement learning platform for antibiotic discovery
Protein language model coupled with reinforcement learning
Multi-objective optimization combining MIC predictions and physicochemical objectives
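To make the multi-objective idea concrete, here is a minimal sketch of a composite reward that mixes a potency score with physicochemical terms. Everything specific in it is an assumption for illustration: the paper's actual objectives, weights, and targets are not given here, the `mic_prob` callable stands in for the learned MIC classifier, the charge rule is a crude count of K/R versus D/E, and these terms are plain (non-differentiable) Python rather than the differentiable objectives the paper describes. The hydropathy values are the standard Kyte-Doolittle scale.

```python
import math
from typing import Callable

# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq: str) -> int:
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E (His, termini ignored)."""
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy over the sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[a] for a in seq) / len(seq)


def gaussian_score(value: float, target: float, width: float) -> float:
    """Smooth score in (0, 1] that peaks when value equals target."""
    return math.exp(-((value - target) / width) ** 2)


def composite_reward(
    seq: str,
    mic_prob: Callable[[str], float],  # hypothetical classifier: P(low MIC) in [0, 1]
    w_mic: float = 0.6,                # assumed weights; they sum to 1
    w_charge: float = 0.2,
    w_hydro: float = 0.2,
    target_charge: float = 4.0,        # assumed targets, not taken from the paper
    target_hydropathy: float = 0.0,
) -> float:
    """Weighted sum of a potency score and two physicochemical scores, in [0, 1]."""
    charge_term = gaussian_score(net_charge(seq), target_charge, width=3.0)
    hydro_term = gaussian_score(mean_hydropathy(seq), target_hydropathy, width=2.0)
    return w_mic * mic_prob(seq) + w_charge * charge_term + w_hydro * hydro_term
```

A call such as `composite_reward("KKLLKK", lambda s: 0.9)` scores one candidate with a stub classifier. In a reinforcement learning loop, a scalar like this would be the reward assigned to each generated sequence.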
Hanqun Cao
The Chinese University of Hong Kong
Generative Modeling, AI4Science
Marcelo D. T. Torres
Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
Jingjie Zhang
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
Zijun Gao
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
Fang Wu
Department of Computer Science, Stanford University, Stanford, California, United States of America
Chunbin Gu
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
Jure Leskovec
Professor of Computer Science, Stanford University
Data mining, Machine Learning, Graph Neural Networks, Knowledge Graphs, Complex Networks
Yejin Choi
Stanford University / NVIDIA
Natural Language Processing, Deep Learning, Artificial Intelligence, Commonsense Reasoning
Cesar de la Fuente-Nunez
Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
Guangyong Chen
Hangzhou Institute of Medicine, Chinese Academy of Sciences, Zhejiang, China
Pheng-Ann Heng
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China