🤖 AI Summary
Antimicrobial resistance (AMR) poses a growing global health threat, creating an urgent need for new antibiotics. This work presents ApexAmphion, a deep reinforcement learning framework that couples a 6.4-billion-parameter protein language model with proximal policy optimization (PPO) for de novo antimicrobial peptide (AMP) design. The method combines supervised fine-tuning on curated peptide data, a learned minimum inhibitory concentration (MIC) classifier, and differentiable physicochemical objectives, unifying generation, scoring, and multi-objective optimization in a single pipeline that supports iterative steering toward potency and developability. In vitro evaluation of 100 designed peptides found low MIC values for all candidates (in the nanomolar range in some cases), 99 of the 100 showed broad-spectrum activity against at least two clinically relevant bacteria, and the lead molecules killed bacteria primarily by targeting the cytoplasmic membrane. The approach offers a scalable, AI-driven route to peptide antibiotic discovery.
📝 Abstract
Antimicrobial resistance (AMR) is projected to cause up to 10 million deaths annually by 2050, underscoring the urgent need for new antibiotics. Here we present ApexAmphion, a deep-learning framework for de novo design of antibiotics that couples a 6.4-billion-parameter protein language model with reinforcement learning. The model is first fine-tuned on curated peptide data to capture antimicrobial sequence regularities, then optimized with proximal policy optimization against a composite reward that combines predictions from a learned minimum inhibitory concentration (MIC) classifier with differentiable physicochemical objectives. In vitro evaluation of 100 designed peptides showed low MIC values (nanomolar range in some cases) for all candidates (100% hit rate). Moreover, 99 out of 100 peptides exhibited broad-spectrum antimicrobial activity against at least two clinically relevant bacteria. The lead molecules killed bacteria primarily by potently targeting the cytoplasmic membrane. By unifying generation, scoring and multi-objective optimization with deep reinforcement learning in a single pipeline, our approach rapidly produces diverse, potent candidates, offering a scalable route to peptide antibiotics and a platform for iterative steering toward potency and developability within hours.
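To make the idea of a composite reward more concrete, the following is a minimal illustrative sketch of how a classifier score and physicochemical terms might be blended into a single scalar for a policy-gradient method such as PPO. It is not the authors' implementation: the function names, the weights `w_mic` and `w_phys`, the target charge and hydrophobic fraction, and the residue groupings are all assumptions chosen for illustration, and the MIC classifier is represented only by a probability passed in by the caller. The physicochemical terms here are simple non-differentiable heuristics, whereas the abstract describes differentiable objectives.

```python
# Illustrative sketch only. All names, weights, and targets below are
# hypothetical assumptions, not values or code from the paper.

HYDROPHOBIC = set("AILMFWVY")  # assumed hydrophobic residue set
POSITIVE = set("KR")           # residues counted as +1
NEGATIVE = set("DE")           # residues counted as -1


def net_charge(seq: str) -> int:
    """Crude net charge: K/R count as +1, D/E as -1 (ignores His and termini)."""
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)


def hydrophobic_fraction(seq: str) -> float:
    """Fraction of residues in the assumed hydrophobic set (0.0 for an empty sequence)."""
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq) if seq else 0.0


def physchem_score(seq: str, target_charge: float = 4.0,
                   target_hydrophobic: float = 0.45) -> float:
    """Score in [0, 1] that peaks when charge and hydrophobicity hit the assumed targets."""
    charge_term = max(0.0, 1.0 - abs(net_charge(seq) - target_charge) / target_charge)
    hydro_term = max(
        0.0,
        1.0 - abs(hydrophobic_fraction(seq) - target_hydrophobic) / target_hydrophobic,
    )
    return 0.5 * (charge_term + hydro_term)


def composite_reward(seq: str, mic_prob: float,
                     w_mic: float = 0.7, w_phys: float = 0.3) -> float:
    """Weighted blend of a classifier probability (supplied by the caller) and physchem_score."""
    return w_mic * mic_prob + w_phys * physchem_score(seq)


if __name__ == "__main__":
    # mic_prob stands in for the output of a learned MIC classifier.
    print(composite_reward("KKLLKKLL", mic_prob=0.9))
```

In a reinforcement learning loop, a scalar of this kind would be computed for each generated sequence and used as the reward signal for the policy update. A weighted sum is only one way to combine objectives; the source does not specify the combination rule.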