🤖 AI Summary
Synthetic polymer design has long been hindered by inefficient 3D conformational modeling and by a lack of generative methods that respect real-world conformational diversity. To address this, we propose polyGen, the first latent diffusion generative model tailored to synthetic polymers. Our method introduces a polymer connectivity-aware molecular graph encoding and conformation-enhanced sampling, enabling diverse, chemically plausible 3D structures, both linear and branched, to be generated from the repeat unit chemistry alone. Because only 3,855 DFT-optimized polymer structures are available, the model is trained jointly on DFT-optimized small-molecule and polymer datasets, and it is evaluated using newly established polymer structure matching criteria. Joint training on these chemically similar structures improves results, although performance decreases for repeat units with a high atom count. The work is a first proof of concept for generative modeling of highly flexible polymers, narrowing the gap between computational polymer design and realistic structural modeling.
📝 Abstract
Synthetic polymeric materials underpin fundamental technologies in the energy, electronics, consumer goods, and medical sectors, yet their development still suffers from prolonged design timelines. Although polymer informatics tools have accelerated parts of this process, polymer simulation protocols still face a significant challenge: the on-demand generation of realistic 3D atomic structures that respect the conformational diversity of polymers. Generative algorithms exist for 3D structures of inorganic crystals, biopolymers, and small molecules, but none has addressed synthetic polymers. In this work, we introduce polyGen, the first latent diffusion model designed specifically to generate realistic polymer structures from minimal inputs, such as the repeat unit chemistry alone, leveraging a molecular encoding that captures polymer connectivity throughout the architecture. Because only 3855 DFT-optimized polymer structures are available, we augment training with DFT-optimized molecular structures and show that joint learning across similar chemical structures improves performance. We also establish structure matching criteria to benchmark our approach on this novel problem. polyGen effectively generates diverse conformations of both linear chains and complex branched structures, though its performance decreases for repeat units with a high atom count. Given these initial results, polyGen represents a paradigm shift in atomic-level structure generation for polymer science: it is the first proof of concept for predicting realistic atomic-level polymer conformations while accounting for their intrinsic structural flexibility.