🤖 AI Summary
This review systematically compares how diffusion models are applied to the generative design of small-molecule and therapeutic peptide drugs. The two modalities pose different challenges: small-molecule design must ensure chemical synthesizability, whereas peptide design must achieve stability against proteolysis, proper folding, and low immunogenicity. Both domains share three hurdles: the need for more accurate scoring functions, the scarcity of high-quality experimental data, and the requirement for experimental validation. Approach: The review analyzes how a unified iterative denoising framework is adapted to the molecular representations, chemical spaces, and design objectives of each modality. Findings: For small molecules, diffusion models excel at structure-based design, generating novel, pocket-fitting ligands with desired physicochemical properties; for peptides, the focus shifts to generating functional sequences and designing de novo structures. Conclusion: The authors argue that the full potential of diffusion models depends on bridging these modality-specific gaps and integrating the models into automated, closed-loop Design-Build-Test-Learn (DBTL) platforms.
📝 Abstract
Diffusion models have emerged as a leading framework in generative modeling, showing significant potential to accelerate and transform the traditionally slow and costly process of drug discovery. This review provides a systematic comparison of their application in designing two principal therapeutic modalities: small molecules and therapeutic peptides. We analyze how a unified framework of iterative denoising is adapted to the distinct molecular representations, chemical spaces, and design objectives of each modality. For small molecules, these models excel at structure-based design, generating novel, pocket-fitting ligands with desired physicochemical properties, yet face the critical hurdle of ensuring chemical synthesizability. Conversely, for therapeutic peptides, the focus shifts to generating functional sequences and designing de novo structures, where the primary challenges are achieving biological stability against proteolysis, ensuring proper folding, and minimizing immunogenicity. Despite these distinct challenges, both domains face shared hurdles: the need for more accurate scoring functions, the scarcity of high-quality experimental data, and the crucial requirement for experimental validation. We conclude that the full potential of diffusion models will be unlocked by bridging these modality-specific gaps and integrating them into automated, closed-loop Design-Build-Test-Learn (DBTL) platforms, thereby shifting the paradigm from chemical exploration to the targeted creation of novel therapeutics.