🤖 AI Summary
Problem: Existing polymer representation methods rely predominantly on monomer-level descriptors and overlook the global structural information carried by polymer conformations. The field also lacks a general-purpose foundation model that supports diverse downstream tasks.
Method: We propose PolyConFM, the first conformation-centric polymer foundation model, which unifies polymer modeling and design through generative pretraining. Each polymer conformation is decomposed into a sequence of local conformations (those of its repeating units). Under a conditional generation framework, the model reconstructs these local conformations via masked autoregressive (MAR) modeling and then generates their orientation transformations to recover the full polymer conformation. To enable this pretraining, a high-quality polymer conformation dataset is constructed via molecular dynamics simulations.
Contribution/Results: Experiments show that PolyConFM consistently outperforms representative task-specific methods across diverse downstream tasks, including property prediction and inverse design, indicating strong generalization across tasks. PolyConFM is presented as a universal tool for polymer discovery and development.
📝 Abstract
Polymers, macromolecules formed from covalently bonded monomers, underpin countless technologies and are indispensable to modern life. While deep learning is advancing polymer science, existing methods typically represent the whole polymer solely through monomer-level descriptors, overlooking the global structural information inherent in polymer conformations, which ultimately limits their practical performance. Moreover, the field still lacks a universal foundation model that can effectively support diverse downstream tasks, which severely constrains progress. To address these challenges, we introduce PolyConFM, the first polymer foundation model that unifies polymer modeling and design through conformation-centric generative pretraining. Recognizing that each polymer conformation can be decomposed into a sequence of local conformations (i.e., those of its repeating units), we pretrain PolyConFM under the conditional generation paradigm, reconstructing these local conformations via masked autoregressive (MAR) modeling and further generating their orientation transformations to recover the corresponding polymer conformation. In addition, we construct the first high-quality polymer conformation dataset via molecular dynamics simulations to mitigate data sparsity, thereby enabling conformation-centric pretraining. Experiments demonstrate that PolyConFM consistently outperforms representative task-specific methods on diverse downstream tasks, equipping polymer science with a universal and powerful tool.
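The decomposition described in the abstract (a polymer conformation as a sequence of local conformations, each placed by an orientation transformation) can be illustrated with a small geometric sketch. This is not the PolyConFM implementation. It assumes, purely for illustration, that each repeating unit's local conformation is a list of 3D atom coordinates in its own local frame and that each orientation transformation is a rigid transform (a rotation matrix `R` and a translation `t`) mapping that frame into the global frame. The function names `rotation_z`, `apply_transform`, and `assemble_conformation` are hypothetical, and the abstract does not specify how the model actually parameterizes these transformations.

```python
import math


def rotation_z(theta):
    """Return a 3x3 rotation matrix for a rotation of theta radians about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_transform(rotation, translation, points):
    """Map local-frame points into the global frame via x -> R x + t."""
    transformed = []
    for p in points:
        transformed.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return transformed


def assemble_conformation(local_conformations, transforms):
    """Recover a full conformation from per-unit coordinates and (R, t) pairs.

    Assumption: every repeating unit carries its own transform into the
    global frame. This is an illustrative choice, not the paper's scheme.
    """
    if len(local_conformations) != len(transforms):
        raise ValueError("need exactly one transform per local conformation")
    polymer = []
    for coords, (rotation, translation) in zip(local_conformations, transforms):
        polymer.extend(apply_transform(rotation, translation, coords))
    return polymer


# Toy example: a two-atom repeating unit placed twice.
unit = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
transforms = [
    (identity, (0.0, 0.0, 0.0)),                # first unit stays at the origin
    (rotation_z(math.pi / 2), (1.0, 0.0, 0.0)),  # second unit is turned 90 degrees and shifted
]
polymer = assemble_conformation([unit, unit], transforms)
```

In this toy case the second unit starts where the first one ends, and its last atom lands at approximately (1, 1, 0). In the pretraining setup described above, the local coordinates and the transformations would be generated by the model rather than supplied by hand.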