General Multimodal Protein Design Enables DNA-Encoding of Chemistry

📅 2026-04-06
📈 Citations: 0
✹ Influential: 0
🤖 AI Summary
Current enzyme design approaches rely on predefined catalytic residues, limiting the expansion of DNA-encodable chemical reaction space. This work proposes DISCO, a multimodal diffusion model that, for the first time, enables joint generation of protein sequences and three-dimensional structures solely from reaction intermediates—without requiring pre-specified catalytic residues—thereby designing multifunctional heme enzymes with novel active-site geometries. The method supports cross-modal, multi-objective optimization during inference and integrates directed evolution for experimental validation. Designed enzymes efficiently catalyze diverse non-natural carbene transfer reactions, including cyclopropanation, B–H insertion, and C(sp³)–H insertion, outperforming existing engineered enzymes in activity, with further enhancements achievable through random mutagenesis.
📝 Abstract
Evolution is an extraordinary engine for enzymatic diversity, yet the chemistry it has explored remains a narrow slice of what DNA can encode. Deep generative models can design new proteins that bind ligands, but none have created enzymes without pre-specifying catalytic residues. We introduce DISCO (DIffusion for Sequence-structure CO-design), a multimodal model that co-designs protein sequence and 3D structure around arbitrary biomolecules, as well as inference-time scaling methods that optimize objectives across both modalities. Conditioned solely on reactive intermediates, DISCO designs diverse heme enzymes with novel active-site geometries. These enzymes catalyze new-to-nature carbene-transfer reactions, including alkene cyclopropanation, spirocyclopropanation, B-H, and C(sp$^3$)-H insertions, with high activities exceeding those of engineered enzymes. Random mutagenesis of a selected design further confirmed that enzyme activity can be improved through directed evolution. By providing a scalable route to evolvable enzymes, DISCO broadens the potential scope of genetically encodable transformations. Code is available at https://github.com/DISCO-design/DISCO.
Problem

Research questions and friction points this paper is trying to address.

protein design
enzyme design
DNA-encoded chemistry
carbene-transfer reactions
de novo enzyme
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal protein design
diffusion model
enzyme design
carbene-transfer reactions
sequence-structure co-design
Jarrid Rector-Brooks
Université de Montréal, Mila, Caltech
Generative modeling, Machine learning, Computer science
Théophile Lambert
Université Paris-Saclay
Marta Skreta
University of Toronto
Daniel Roth
Technical University of Munich
Human-Centered Computing, Extended Reality, Artificial Intelligence, Robotics, Digital Health
Yueming Long
California Institute of Technology
Zi-Qi Li
California Institute of Technology
Xi Zhang
McGill University
Miruna Cretu
University of Cambridge
Francesca-Zhoufan Li
California Institute of Technology
Tanvi Ganapathy
California Institute of Technology
Emily Jin
University of Oxford
Avishek Joey Bose
Imperial College London
Jason Yang
Massachusetts Institute of Technology
algorithms, complexity theory
Kirill Neklyudov
Université de Montréal; Mila - Quebec AI Institute
Yoshua Bengio
Professor of computer science, University of Montreal, Mila, IVADO, CIFAR
Machine learning, deep learning, artificial intelligence
Alexander Tong
Aithyra
Flow Models, Deep Learning, Optimal Transport, Single-cell, Protein design
Frances H. Arnold
California Institute of Technology
Cheng-Hao Liu
Caltech