Modeling All-Atom Glycan Structures via Hierarchical Message Passing and Multi-Scale Pre-training

📅 2025-06-02
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Current glycan modeling predominantly relies on monosaccharide-level graphs, neglecting atomic-level details critical to glycan function and physicochemical properties. To address this limitation, the authors propose GlycanAA, a heterogeneous graph neural network for all-atom glycan modeling. GlycanAA represents monosaccharide units and individual atoms as nodes of a heterogeneous graph and introduces a hierarchical message-passing mechanism that jointly captures local atomic interactions and the global glycosidic backbone topology. Furthermore, they design a multi-scale masked prediction pre-training strategy that concurrently models dependencies at the atom, bond, and monosaccharide levels; the resulting pre-trained variant is called PreGlycanAA. Extensive experiments show that GlycanAA outperforms state-of-the-art glycan encoders across multiple property prediction tasks, including aqueous solubility and protein-binding affinity, and that PreGlycanAA achieves further gains. The code and datasets are publicly available.

πŸ“ Abstract
Understanding the various properties of glycans with machine learning has shown some preliminary promise. However, previous methods mainly focused on modeling the backbone structure of glycans as graphs of monosaccharides (i.e., sugar units), while neglecting the atomic structures underlying each monosaccharide, which are actually important indicators of glycan properties. We fill this gap by introducing the GlycanAA model for All-Atom-wise Glycan modeling. GlycanAA models a glycan as a heterogeneous graph, with monosaccharide nodes representing its global backbone structure and atom nodes representing its local atomic-level structures. Based on such a graph, GlycanAA performs hierarchical message passing to capture interactions from the local atomic level up to the global monosaccharide level. To further enhance model capability, we pre-train GlycanAA on a high-quality unlabeled glycan dataset, deriving the PreGlycanAA model. We design a multi-scale mask prediction algorithm to endow the model with knowledge of the different levels of dependencies within a glycan. Extensive benchmark results show the superiority of GlycanAA over existing glycan encoders and verify the further improvements achieved by PreGlycanAA. We maintain all resources at https://github.com/kasawa1234/GlycanAA
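The abstract's hierarchical message-passing idea can be sketched as follows. This is an illustrative toy example, not the paper's actual architecture: all names, dimensions, adjacency structures, and weight shapes are assumptions chosen to show the three stages (local atom-level passing, atom-to-monosaccharide pooling, global backbone passing) on a heterogeneous graph.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy glycan: 2 monosaccharide nodes, each owning 3 atom nodes.
n_mono, atoms_per_mono, d = 2, 3, 8
atom_h = rng.normal(size=(n_mono * atoms_per_mono, d))    # atom features
mono_h = rng.normal(size=(n_mono, d))                     # monosaccharide features
atom2mono = np.repeat(np.arange(n_mono), atoms_per_mono)  # atom -> owning unit

# Atom-level adjacency: chain bonds within each monosaccharide (toy choice).
A_atom = np.zeros((n_mono * atoms_per_mono,) * 2)
for i in range(len(atom2mono) - 1):
    if atom2mono[i] == atom2mono[i + 1]:
        A_atom[i, i + 1] = A_atom[i + 1, i] = 1.0

# Monosaccharide-level adjacency: glycosidic backbone (a single linkage here).
A_mono = np.array([[0.0, 1.0], [1.0, 0.0]])

W_local, W_up, W_global = (rng.normal(size=(d, d)) for _ in range(3))

def relu(x):
    return np.maximum(x, 0.0)

# Step 1: local atom-level message passing inside each monosaccharide.
atom_h = relu(A_atom @ atom_h @ W_local + atom_h)

# Step 2: pool atom states "upward" into their owning monosaccharide.
pooled = np.stack([atom_h[atom2mono == m].mean(0) for m in range(n_mono)])
mono_h = mono_h + pooled @ W_up

# Step 3: global monosaccharide-level message passing along the backbone.
mono_h = relu(A_mono @ mono_h @ W_global + mono_h)

glycan_embedding = mono_h.mean(0)  # whole-glycan readout
print(glycan_embedding.shape)      # (8,)
```

The design choice the sketch illustrates is that atomic detail influences the backbone representation only through the pooled "upward" messages, so local chemistry and global topology are modeled at their natural scales.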
Problem

Research questions and friction points this paper is trying to address.

Modeling all-atom glycan structures via hierarchical message passing
Capturing atomic-level interactions neglected by previous methods
Enhancing model capability with multi-scale pre-training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical message passing for multi-level interactions
Multi-scale pre-training with mask prediction
Heterogeneous graph modeling glycan atomic structures
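The multi-scale mask prediction idea in the bullets above can be sketched as corrupting tokens independently at each scale and asking the model to reconstruct them. This is a minimal illustration under assumed details: the token vocabularies, mask ratios, and the `MASK` sentinel are all hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy tokenized glycan: integer type IDs at three scales (illustrative only).
atom_types = rng.integers(0, 4, size=12)  # e.g. element ids
bond_types = rng.integers(0, 3, size=11)  # e.g. bond-order ids
mono_types = rng.integers(0, 5, size=4)   # e.g. monosaccharide ids

MASK = -1  # sentinel id for masked positions

def mask_tokens(tokens, ratio, rng):
    """Replace a random `ratio` of tokens with MASK; return the corrupted
    tokens and a boolean mask marking the positions to predict."""
    tokens = tokens.copy()
    n_mask = max(1, int(round(ratio * len(tokens))))
    idx = rng.choice(len(tokens), size=n_mask, replace=False)
    target_mask = np.zeros(len(tokens), dtype=bool)
    target_mask[idx] = True
    tokens[idx] = MASK
    return tokens, target_mask

# Corrupt each scale independently; a model would then be trained to
# recover the original ids at the masked positions (cross-entropy loss).
masked_atoms, atom_tgt = mask_tokens(atom_types, 0.15, rng)
masked_bonds, bond_tgt = mask_tokens(bond_types, 0.15, rng)
masked_monos, mono_tgt = mask_tokens(mono_types, 0.25, rng)

print(atom_tgt.sum(), bond_tgt.sum(), mono_tgt.sum())  # 2 2 1
```

Masking each scale separately forces the encoder to learn dependencies at all three levels rather than letting, say, monosaccharide identity be trivially read off from unmasked atoms.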