An Iterative Framework for Generative Backmapping of Coarse Grained Proteins

📅 2025-05-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Coarse-grained (CG) to all-atom fine-grained (FG) reverse mapping of proteins suffers from low reconstruction accuracy, training instability, and physical distortions. Method: We propose a novel multi-step iterative generative framework that uniquely integrates conditional variational autoencoders (cVAEs) with graph neural networks (GNNs), establishing a physics-guided latent-space optimization mechanism for progressive refinement from minimal CG representations to atomic-resolution structures. Contribution/Results: Theoretically, we introduce a multi-scale iterative reverse-mapping paradigm that jointly enhances modeling fidelity and training robustness, overcoming key bottlenecks in ultra-CG system modeling. Extensive validation across diverse protein topologies demonstrates a 32% reduction in root-mean-square deviation (RMSD), a 2.1× improvement in training efficiency, and strict adherence of generated structures to fundamental physical constraints—including bond lengths and bond angles.

Technology Category

Application Category

📝 Abstract
The techniques of data-driven backmapping from coarse-grained (CG) to fine-grained (FG) representation often struggle with accuracy, unstable training, and physical realism, especially when applied to complex systems such as proteins. In this work, we introduce a novel iterative framework by using conditional Variational Autoencoders and graph-based neural networks, specifically designed to tackle the challenges associated with such large-scale biomolecules. Our method enables stepwise refinement from CG beads to full atomistic details. We outline the theory of iterative generative backmapping and demonstrate via numerical experiments the advantages of multistep schemes by applying them to proteins of vastly different structures with very coarse representations. This multistep approach not only improves the accuracy of reconstructions but also makes the training process more computationally efficient for proteins with ultra-CG representations.
Problem

Research questions and friction points this paper is trying to address.

Improving accuracy in coarse-grained to fine-grained protein backmapping
Addressing unstable training in large-scale biomolecule reconstruction
Enhancing physical realism in ultra-coarse protein representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative framework using conditional Variational Autoencoders
Graph-based neural networks for large biomolecules
Stepwise refinement from coarse to atomistic details
🔎 Similar Papers
No similar papers found.
G
Georgios Kementzidis
Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11790, USA
E
Erin Wong
Garcia Center for Polymers at Engineered Interfaces, Stony Brook University, Stony Brook, NY 11790, USA
Ruichen Xu
Ruichen Xu
Stony Brook University
Machine LearningDeep LearningSimulated Annealing AlgorithmNeural OperatorPDE Solver
Yuefan Deng
Yuefan Deng
Stony Brook University
Parallel ComputingMolecular DynamicsMonte Carlo MethodsMultiscale Modeling