🤖 AI Summary
This review addresses *de novo* protein design, surveying generative approaches based on denoising diffusion probabilistic models (DDPMs). It covers methods that combine 3D coordinate modeling, SE(3)-equivariant neural networks, and joint sequence–structure generation to enable controllable design of target structures and functions. The review systematically examines diffusion models for generating protein backbones and sequences, compares the strengths and limitations of different models, and summarizes successful design cases. As a representative example, RFDiffusion is reported to achieve success rates across 25 protein design tasks that far exceed those of RFjoint, hallucination-based methods, and traditional approaches. More broadly, generative AI models are credited with raising the success rate of de novo design and reducing experimental costs. Applications mentioned include enzyme engineering and antigen design, and the review closes with a discussion of future development directions.
📝 Abstract
De novo protein design is the creation of proteins that do not exist in nature and that have specified structures and functions. In recent years, the accumulation of high-quality protein structure and sequence data, together with technological advances, has paved the way for the successful application of generative artificial intelligence (AI) models to protein design. These models have surpassed traditional approaches that rely on fragments and bioinformatics: they have significantly increased the success rate of de novo protein design and reduced experimental costs, leading to breakthroughs in the field. Among generative AI models, diffusion models have yielded the most promising results in protein design. In the past two to three years, more than ten diffusion-based protein design models have emerged. The representative model, RFDiffusion, has demonstrated success rates across 25 protein design tasks that far exceed those of traditional methods and of other AI-based approaches such as RFjoint and hallucination. This review systematically examines the application of diffusion models to generating protein backbones and sequences. We explore the strengths and limitations of different models, summarize successful cases of protein design using diffusion models, and discuss future development directions.