Modeling All-Atom Glycan Structures via Hierarchical Message Passing and Multi-Scale Pre-training

📅 2025-06-02

🤖 AI Summary

Current glycan modeling predominantly relies on monosaccharide-level graphs, neglecting atomic-level details critical to glycan function and physicochemical properties. To address this limitation, the authors propose GlycanAA, described as the first heterogeneous graph neural network for all-atom glycan modeling. GlycanAA treats both monosaccharide units and individual atoms as nodes of one heterogeneous graph and introduces a hierarchical message-passing mechanism that jointly captures local atomic interactions and the global glycosidic backbone topology. The authors also design a multi-scale masked prediction pre-training strategy that concurrently models dependencies at the atom, bond, and monosaccharide levels. Benchmark experiments show that GlycanAA outperforms existing glycan encoders across multiple glycan property prediction tasks, including aqueous solubility and protein-binding affinity. The pre-trained variant, PreGlycanAA, achieves further performance gains. The code and datasets are publicly available.

📝 Abstract

Machine learning has shown preliminary promise for understanding the various properties of glycans. However, previous methods mainly model the backbone structure of a glycan as a graph of monosaccharides (i.e., sugar units) and neglect the atomic structure underlying each monosaccharide, which is an important indicator of glycan properties. We fill this gap by introducing GlycanAA, a model for All-Atom-wise Glycan modeling. GlycanAA models a glycan as a heterogeneous graph with monosaccharide nodes representing its global backbone structure and atom nodes representing its local atomic-level structures. On this graph, GlycanAA performs hierarchical message passing to capture interactions ranging from local atomic-level ones to global monosaccharide-level ones. To further enhance model capability, we pre-train GlycanAA on a high-quality unlabeled glycan dataset, deriving the PreGlycanAA model. We design a multi-scale mask prediction algorithm that teaches the model the dependencies at different levels of a glycan. Extensive benchmark results show the superiority of GlycanAA over existing glycan encoders and verify the further improvements achieved by PreGlycanAA. We maintain all resources at https://github.com/kasawa1234/GlycanAA
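
To make the heterogeneous graph and the local-to-global message flow concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the `HeteroGlycanGraph` class, its field names, the mean aggregation, and the residual-style update are all assumptions chosen for illustration, and atom and monosaccharide features are assumed to share one dimension so that pooled atom features can be added to monosaccharide features.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]


@dataclass
class HeteroGlycanGraph:
    """Toy heterogeneous glycan graph (hypothetical layout, not the GlycanAA API)."""
    atom_feats: List[Vector]             # one feature vector per atom node
    mono_feats: List[Vector]             # one feature vector per monosaccharide node
    atom_bonds: List[Tuple[int, int]]    # atom-atom edges (covalent bonds)
    atom_to_mono: List[int]              # owning monosaccharide index for each atom
    glyco_bonds: List[Tuple[int, int]]   # monosaccharide-monosaccharide edges


def mean(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector when there is nothing to average."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def neighbor_update(feats: List[Vector], edges: List[Tuple[int, int]]) -> List[Vector]:
    """One round of mean-aggregation message passing over undirected edges."""
    dim = len(feats[0])
    neighbors: List[List[Vector]] = [[] for _ in feats]
    for u, v in edges:
        neighbors[u].append(feats[v])
        neighbors[v].append(feats[u])
    return [add(f, mean(n, dim)) for f, n in zip(feats, neighbors)]


def hierarchical_pass(g: HeteroGlycanGraph) -> Tuple[List[Vector], List[Vector]]:
    """Local-to-global pass: atoms, then atom-to-monosaccharide, then backbone."""
    # 1) Local: atoms exchange messages along covalent bonds.
    atoms = neighbor_update(g.atom_feats, g.atom_bonds)
    # 2) Cross-level: each monosaccharide pools the atoms it owns.
    dim = len(g.mono_feats[0])
    pooled = [
        mean([a for a, m in zip(atoms, g.atom_to_mono) if m == i], dim)
        for i in range(len(g.mono_feats))
    ]
    monos = [add(m, p) for m, p in zip(g.mono_feats, pooled)]
    # 3) Global: monosaccharides exchange messages along glycosidic bonds.
    monos = neighbor_update(monos, g.glyco_bonds)
    return atoms, monos


# Toy input: two monosaccharides, three atoms, one covalent and one glycosidic bond.
toy = HeteroGlycanGraph(
    atom_feats=[[1.0], [3.0], [5.0]],
    mono_feats=[[0.0], [0.0]],
    atom_bonds=[(0, 1)],
    atom_to_mono=[0, 0, 1],
    glyco_bonds=[(0, 1)],
)
atoms, monos = hierarchical_pass(toy)
```

A learned model would replace the fixed mean-and-add updates with trainable, relation-specific transformations; the sketch only shows the order in which information moves from atoms to the backbone.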

Problem

Research questions and friction points this paper is trying to address.

Modeling all-atom glycan structures via hierarchical message passing

Capturing atomic-level interactions neglected by previous methods

Enhancing model capability with multi-scale pre-training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical message passing for multi-level interactions

Multi-scale pre-training with mask prediction

Heterogeneous graph modeling glycan atomic structures
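
The multi-scale mask prediction idea listed above can be illustrated with a short Python sketch. This is a sketch under stated assumptions, not the paper's algorithm: the `[MASK]` token, the 15% ratio, the `multi_scale_mask` helper, and the choice to mask atom, bond, and monosaccharide labels independently are all hypothetical and chosen only to show the shape of the pre-training signal.

```python
import random
from typing import Dict, List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token


def mask_tokens(
    tokens: List[str], ratio: float, rng: random.Random
) -> Tuple[List[str], Dict[int, str]]:
    """Mask a random subset of tokens; return the corrupted list and the targets."""
    if not tokens:
        return [], {}
    k = max(1, round(ratio * len(tokens)))  # always mask at least one item
    corrupted = list(tokens)
    targets: Dict[int, str] = {}
    for i in rng.sample(range(len(tokens)), k):
        targets[i] = tokens[i]   # label the model would have to predict
        corrupted[i] = MASK      # what the model would actually see
    return corrupted, targets


def multi_scale_mask(
    glycan: Dict[str, List[str]], ratio: float = 0.15, seed: int = 0
) -> Dict[str, Tuple[List[str], Dict[int, str]]]:
    """Apply masking independently at every scale (assumed: atom, bond, monosaccharide)."""
    rng = random.Random(seed)
    return {scale: mask_tokens(tokens, ratio, rng) for scale, tokens in glycan.items()}


# Toy glycan described by type labels at three scales (illustrative values only).
toy_glycan = {
    "atom": ["C", "C", "O", "O", "C", "N"],
    "bond": ["single", "single", "single", "double", "single"],
    "monosaccharide": ["Glc", "Gal", "GlcNAc"],
}
masked = multi_scale_mask(toy_glycan)
```

During pre-training, an encoder would read the corrupted labels and be trained to recover the entries stored in `targets` at each scale, typically with a classification loss per scale.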

🔎 Similar Papers

Higher-Order Message Passing for Glycan Representation Learning