Learning Repetition-Invariant Representations for Polymer Informatics

📅 2025-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing graph neural networks (GNNs) for polymer modeling produce inconsistent representations when the number of repeat units varies, which hinders generalization to unseen chain lengths. To address this, we propose the Graph Repetition Invariance (GRIN) framework, which provides theoretical guarantees of repetition invariance from both the model and the data perspective, including a proof that augmentation with three repeat units is the minimum required for optimal invariant representation learning. Methodologically, GRIN combines graph-based maximum spanning tree alignment with repeat-unit augmentation to build structure-aware, chain-length-agnostic representations within GNNs. On homopolymer and copolymer benchmarks, GRIN outperforms state-of-the-art baselines, generalizes to unseen chain lengths, and learns stable representations. The work offers a theoretically grounded approach to invariant representation learning for polymer informatics.

📝 Abstract

Polymers are large macromolecules composed of repeating structural units known as monomers and are widely applied in fields such as energy storage, construction, medicine, and aerospace. However, existing graph neural network methods, though effective for small molecules, only model the single unit of polymers and fail to produce consistent vector representations for the true polymer structure with varying numbers of units. To address this challenge, we introduce Graph Repetition Invariance (GRIN), a novel method to learn polymer representations that are invariant to the number of repeating units in their graph representations. GRIN integrates a graph-based maximum spanning tree alignment with repeat-unit augmentation to ensure structural consistency. We provide theoretical guarantees for repetition-invariance from both model and data perspectives, demonstrating that three repeating units are the minimal augmentation required for optimal invariant representation learning. GRIN outperforms state-of-the-art baselines on both homopolymer and copolymer benchmarks, learning stable, repetition-invariant representations that generalize effectively to polymer chains of unseen sizes.
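To make the idea of repeat-unit augmentation concrete, the following is a minimal sketch of how a graph with k chained repeat units might be built from a single unit. It is an illustration under stated assumptions, not the paper's implementation: the function name `repeat_unit_graph`, the `head` and `tail` attachment indices, and the plain edge-list representation are all hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def repeat_unit_graph(
    num_atoms: int, edges: List[Edge], head: int, tail: int, k: int
) -> Tuple[int, List[Edge]]:
    """Chain k copies of a repeat unit into one graph.

    Hypothetical helper: `head` and `tail` are the atom indices where
    consecutive units bond to each other (an assumption of this sketch).
    Returns the total node count and the combined edge list.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    chained: List[Edge] = []
    for i in range(k):
        offset = i * num_atoms
        # Copy the unit's internal bonds with shifted node indices.
        chained.extend((u + offset, v + offset) for u, v in edges)
        # Bond this copy's tail to the next copy's head, if there is one.
        if i + 1 < k:
            chained.append((tail + offset, head + offset + num_atoms))
    return k * num_atoms, chained


# Hypothetical 3-atom repeat unit with backbone 0-1-2, repeated three times.
n, e = repeat_unit_graph(3, [(0, 1), (1, 2)], head=0, tail=2, k=3)
print(n, e)
```

With k = 3, the sketch produces a 9-node path whose inter-unit bonds are (2, 3) and (5, 6); a model trained on such augmented graphs sees the bonds between units as well as those within a unit.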

Problem

Research questions and friction points this paper is trying to address.

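Existing GNN methods model only a single repeat unit and fail to produce consistent representations for polymers with varying numbers of repeating units

How to learn polymer representations that are invariant to the number of repeating units

How to ensure structural consistency across chain lengths, which GRIN approaches through graph-based alignment and repeat-unit augmentation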

Innovation

Methods, ideas, or system contributions that make the work stand out.

GRIN learns polymer representations that are invariant to the number of repeating units

Uses graph-based maximum spanning tree alignment

Shows that augmentation with three repeating units is the minimum required for optimal invariant representation learning
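For readers unfamiliar with maximum spanning trees, here is a minimal sketch of how one can be extracted from a weighted graph using Kruskal's algorithm with a union-find structure. It is a generic illustration, not GRIN's alignment procedure: the edge weights below are made-up values, and how GRIN assigns weights or uses the resulting tree is not described in this summary.

```python
from typing import List, Tuple

WeightedEdge = Tuple[int, int, float]


def maximum_spanning_tree(
    num_nodes: int, edges: List[WeightedEdge]
) -> List[WeightedEdge]:
    """Kruskal's algorithm, visiting edges from heaviest to lightest."""
    parent = list(range(num_nodes))

    def find(x: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree: List[WeightedEdge] = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two components, so it cannot close a cycle.
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree


# Hypothetical 4-node ring with illustrative weights.
ring = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.7), (3, 0, 0.1)]
tree = maximum_spanning_tree(4, ring)
print(tree)
```

On the ring, the lightest edge (3, 0) is the one left out, so the cycle is broken at its weakest link and the remaining three edges form a tree.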

🔎 Similar Papers

PolygonGNN: Representation Learning for Polygonal Geometries with Heterogeneous Visibility Graph