Deep learning-guided evolutionary optimization for protein design

📅 2026-03-03
🤖 AI Summary
This work addresses the problem of efficiently searching for functional protein sequences in a vast sequence space where the sequence-function relationship is highly non-linear. The authors propose BoGA, a framework that embeds a genetic algorithm in a Bayesian optimization loop: the genetic algorithm acts as a stochastic proposal generator, and a deep learning-based surrogate model, updated over iterative evaluation cycles, prioritizes which proposals to evaluate. This enables data-efficient exploration of high-dimensional sequence spaces. BoGA is benchmarked on sequence and structure design tasks and is then applied to design peptide binders against pneumolysin, a key virulence factor of Streptococcus pneumoniae, where it accelerates the discovery of high-confidence binders.

📝 Abstract
Designing novel proteins with desired characteristics remains a significant challenge due to the large sequence space and the complexity of sequence-function relationships. Efficient exploration of this space to identify sequences that meet specific design criteria is crucial for advancing therapeutics and biotechnology. Here, we present BoGA (Bayesian Optimization Genetic Algorithm), a framework that combines evolutionary search with Bayesian optimization to efficiently navigate the sequence space. By integrating a genetic algorithm as a stochastic proposal generator within a surrogate modeling loop, BoGA prioritizes candidates based on prior evaluations and surrogate model predictions, enabling data-efficient optimization. We demonstrate the utility of BoGA through benchmarking on sequence and structure design tasks, followed by its application in designing peptide binders against pneumolysin, a key virulence factor of Streptococcus pneumoniae. BoGA accelerates the discovery of high-confidence binders, demonstrating the potential for efficient protein design across diverse objectives. The algorithm is implemented within the BoPep suite and is available under an MIT license on GitHub at https://github.com/ErikHartman/bopep.
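
The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the BoPep implementation: the toy oracle, the hidden TARGET peptide, the k-nearest-neighbour surrogate, the exploration weight of 0.1, and every function name and parameter below are assumptions chosen to keep the example self-contained. The paper itself uses deep learning-based surrogates and real design objectives.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
TARGET = "WKLFKKIGAV"  # hypothetical optimum, known only to the toy oracle


def oracle(seq):
    """Toy stand-in for an expensive evaluation: fraction of positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def surrogate(seq, evaluated, k=3):
    """Assumed surrogate: mean score of the k nearest evaluated sequences,
    plus a small bonus for being far from known data (a crude uncertainty proxy)."""
    nearest = sorted(evaluated, key=lambda s: hamming(seq, s))[:k]
    mean = sum(evaluated[s] for s in nearest) / len(nearest)
    uncertainty = hamming(seq, nearest[0]) / len(seq)
    return mean + 0.1 * uncertainty


def propose(evaluated, n_proposals, rng, n_parents=10, mut_rate=0.1):
    """Genetic-algorithm step: one-point crossover and point mutation
    over the best sequences evaluated so far."""
    parents = sorted(evaluated, key=evaluated.get, reverse=True)[:n_parents]
    proposals = set()
    while len(proposals) < n_proposals:
        a, b = rng.choice(parents), rng.choice(parents)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        child = "".join(
            rng.choice(ALPHABET) if rng.random() < mut_rate else c for c in child
        )
        if child not in evaluated:  # only propose unseen sequences
            proposals.add(child)
    return sorted(proposals)  # sorted for a reproducible order


def optimize(n_rounds=15, n_init=20, n_proposals=200, batch=10, seed=0):
    """Surrogate-guided loop: propose many, rank cheaply, evaluate few."""
    rng = random.Random(seed)
    evaluated = {}
    while len(evaluated) < n_init:  # random initial designs
        seq = "".join(rng.choice(ALPHABET) for _ in TARGET)
        evaluated[seq] = oracle(seq)
    history = [max(evaluated.values())]
    for _ in range(n_rounds):
        candidates = propose(evaluated, n_proposals, rng)
        candidates.sort(key=lambda s: surrogate(s, evaluated), reverse=True)
        for seq in candidates[:batch]:  # only the top-ranked reach the oracle
            evaluated[seq] = oracle(seq)
        history.append(max(evaluated.values()))
    return evaluated, history


if __name__ == "__main__":
    evaluated, history = optimize()
    best = max(evaluated, key=evaluated.get)
    print(best, evaluated[best], history)
```

Calling optimize() returns every evaluated sequence with its score, along with the best score after each round. Only `batch` sequences per round reach the oracle, which is where a surrogate-guided search would save evaluations compared with scoring every proposal.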
Problem

Research questions and friction points this paper is trying to address.

protein design
sequence space
sequence-function relationship
peptide binders
evolutionary optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Optimization
Genetic Algorithm
Protein Design
Surrogate Modeling
Peptide Binder
Erik Hartman
Division of Infection Medicine, Faculty of Medicine, Lund University, Sweden
Di Tang
Division of Infection Medicine, Faculty of Medicine, Lund University, Sweden
Johan Malmström
Professor of Infection Medicine, Lund University
proteomics, mass spectrometry, bacterial pathogenesis, host pathogen interactions