🤖 AI Summary
Antimicrobial peptide (AMP) discovery faces two challenges: the vastness of sequence space and the sparsity of active peptides. Existing generative models ignore the geometry that the decoder induces on the latent space and rely on Euclidean metrics, which distorts distances and makes search inefficient. To address this, we propose PepCompass, a Riemannian-geometry-based framework for navigating peptide latent space. The method constructs a union of κ-stable Riemannian manifolds to model the local geometry of the latent space while keeping computation stable. For local exploration, it introduces a second-order approximation to Riemannian Brownian motion for sampling and a tangent-space enumeration of amino-acid substitutions for mutation, and combines the two into Local Enumeration Bayesian Optimization (LE-BO). For seed discovery, it introduces Potential-minimizing Geodesic Search (PoGS), which interpolates between prototype embeddings along property-enriched geodesics. In vitro, PoGS identified four novel seed peptides, and subsequent optimization with LE-BO yielded 25 highly active peptides with broad-spectrum activity, including activity against resistant bacterial strains.
📝 Abstract
Antimicrobial peptide discovery is challenged by the astronomical size of peptide space and the relative scarcity of active peptides. Generative models provide continuous latent "maps" of peptide space, but conventionally ignore decoder-induced geometry and rely on flat Euclidean metrics, rendering exploration and optimization distorted and inefficient. Prior manifold-based remedies assume fixed intrinsic dimensionality, which critically fails in practice for peptide data. Here, we introduce PepCompass, a geometry-aware framework for peptide exploration and optimization. At its core, we define a Union of $\kappa$-Stable Riemannian Manifolds $\mathbb{M}^{\kappa}$, a family of decoder-induced manifolds that captures local geometry while ensuring computational stability. We propose two local exploration methods: Second-Order Riemannian Brownian Efficient Sampling, which provides a convergent second-order approximation to Riemannian Brownian motion, and Mutation Enumeration in Tangent Space, which reinterprets tangent directions as discrete amino-acid substitutions. Combining these yields Local Enumeration Bayesian Optimization (LE-BO), an efficient algorithm for local activity optimization. Finally, we introduce Potential-minimizing Geodesic Search (PoGS), which interpolates between prototype embeddings along property-enriched geodesics, biasing discovery toward seeds, i.e., peptides with favorable activity. In vitro validation confirms the effectiveness of PepCompass: PoGS yields four novel seeds, and subsequent optimization with LE-BO discovers 25 highly active peptides with broad-spectrum activity, including against resistant bacterial strains. These results demonstrate that geometry-informed exploration provides a powerful new paradigm for antimicrobial peptide design.
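
To make the contrast between "flat Euclidean metrics" and "decoder-induced geometry" concrete, the following is a minimal sketch of the standard pullback-metric construction, $G(z) = J(z)^\top J(z)$, where $J$ is the decoder Jacobian. It is an illustration under stated assumptions, not PepCompass's implementation: the two-dimensional latent space, the toy `decoder`, and the helpers `jacobian`, `pullback_metric`, and `curve_length` are all hypothetical, and the $\kappa$-stability condition from the abstract is not modeled here.

```python
import math

def decoder(z):
    """Hypothetical toy decoder R^2 -> R^3, standing in for a peptide decoder."""
    x, y = z
    return [x, y, math.sin(3.0 * x) * math.cos(3.0 * y)]

def jacobian(f, z, eps=1e-6):
    """Central finite-difference Jacobian; J[k][i] = d f_k / d z_i."""
    cols = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        fp, fm = f(zp), f(zm)
        cols.append([(a - b) / (2.0 * eps) for a, b in zip(fp, fm)])
    n_out = len(cols[0])
    return [[cols[i][k] for i in range(len(z))] for k in range(n_out)]

def pullback_metric(f, z):
    """Decoder-induced (pullback) metric G(z) = J(z)^T J(z)."""
    J = jacobian(f, z)
    d = len(z)
    return [[sum(row[i] * row[j] for row in J) for j in range(d)]
            for i in range(d)]

def curve_length(f, z0, z1, steps=200, riemannian=True):
    """Length of the straight latent segment z0 -> z1 (midpoint rule).

    With riemannian=False the flat Euclidean metric is used; otherwise each
    small step v is measured as sqrt(v^T G(z) v) under the pullback metric.
    """
    d = len(z0)
    v = [(b - a) / steps for a, b in zip(z0, z1)]
    total = 0.0
    for s in range(steps):
        t = (s + 0.5) / steps
        z = [a + t * (b - a) for a, b in zip(z0, z1)]
        if riemannian:
            G = pullback_metric(f, z)
            sq = sum(v[i] * G[i][j] * v[j] for i in range(d) for j in range(d))
        else:
            sq = sum(c * c for c in v)
        total += math.sqrt(sq)
    return total

if __name__ == "__main__":
    a, b = [0.0, 0.0], [1.0, 0.0]
    print("Euclidean length: ", curve_length(decoder, a, b, riemannian=False))
    print("Riemannian length:", curve_length(decoder, a, b, riemannian=True))
```

For this toy decoder the same latent segment is longer under the pullback metric than under the Euclidean one, because the decoder output varies quickly along it. This is the kind of distortion a Euclidean latent metric does not see, and the reason geodesics under $G$ can differ from straight latent lines.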