🤖 AI Summary
Existing graph neural networks struggle to efficiently capture multiscale features and long-range dependencies in protein structure modeling. To address this limitation, this work proposes a hierarchical graph modeling framework grounded in geometric secondary structure motifs. The approach constructs both fine-grained subgraphs of individual motifs and a coarse-grained inter-motif graph, employing a two-stage GNN architecture to separately learn local conformations and global topology while integrating spatial arrangement and relative orientation information. This design enables modular selection of GNN backbones, achieving substantial gains in both computational efficiency and predictive accuracy without compromising expressiveness. Extensive experiments across multiple benchmark tasks demonstrate superior performance and reduced computational overhead, confirming the method's effectiveness and generalizability.
📄 Abstract
Graph neural networks (GNNs) have emerged as powerful tools for learning protein structures by capturing spatial relationships at the residue level. However, existing GNN-based methods often struggle to learn multiscale representations and to model long-range dependencies efficiently. In this work, we propose an efficient multiscale graph-based learning framework tailored to proteins. Our framework comprises two key components: (1) It constructs a hierarchical graph representation consisting of a collection of fine-grained subgraphs, each corresponding to a secondary structure motif (e.g., $\alpha$-helices, $\beta$-strands, loops), and a single coarse-grained graph that connects these motifs based on their spatial arrangement and relative orientation. (2) It employs two GNNs for feature learning: the first operates within individual secondary structure motifs to capture local interactions, and the second models higher-level structural relationships across motifs. Our modular framework allows a flexible choice of GNN at each stage. Theoretically, we show that our hierarchical framework preserves the desired maximal expressiveness, ensuring no loss of critical structural information. Empirically, we demonstrate that integrating baseline GNNs into our multiscale framework substantially improves prediction accuracy and reduces computational cost across various benchmarks.
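The two-stage design described in the abstract can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's implementation: it uses simple mean-aggregation message passing as a stand-in for whichever GNN backbone is plugged into each stage, and the function names (`motif_subgraph_pass`, `hierarchical_pass`) are hypothetical. Stage 1 runs a GNN inside each secondary-structure motif subgraph and pools it to a motif embedding; Stage 2 runs a GNN over the coarse inter-motif graph.

```python
import numpy as np

def motif_subgraph_pass(node_feats, edges):
    """One round of mean-aggregation message passing over an undirected graph.

    Stands in for any GNN backbone; the framework is agnostic to this choice.
    node_feats: (n, d) array; edges: list of (u, v) index pairs.
    """
    n = node_feats.shape[0]
    agg = np.zeros_like(node_feats)
    deg = np.zeros(n)
    for u, v in edges:
        agg[u] += node_feats[v]
        agg[v] += node_feats[u]
        deg[u] += 1
        deg[v] += 1
    deg = np.maximum(deg, 1)  # avoid division by zero for isolated nodes
    return node_feats + agg / deg[:, None]  # residual update

def hierarchical_pass(motifs, motif_edges):
    """Two-stage hierarchical pass.

    motifs: list of (residue_feats, intra_motif_edges), one per secondary
            structure motif (e.g., a helix, strand, or loop).
    motif_edges: edges of the coarse-grained inter-motif graph, built from
            the motifs' spatial arrangement and relative orientation.
    """
    # Stage 1: fine-grained GNN within each motif, mean-pooled to one
    # embedding per motif.
    motif_embs = np.stack([
        motif_subgraph_pass(feats, edges).mean(axis=0)
        for feats, edges in motifs
    ])
    # Stage 2: coarse-grained GNN over the inter-motif graph.
    return motif_subgraph_pass(motif_embs, motif_edges)

# Toy example: two motifs (3 and 2 residues, 4-d features) joined by one
# coarse edge; output is one refined embedding per motif.
motifs = [
    (np.ones((3, 4)), [(0, 1), (1, 2)]),
    (np.ones((2, 4)), [(0, 1)]),
]
out = hierarchical_pass(motifs, [(0, 1)])
print(out.shape)  # (2, 4)
```

Because each stage only sees a subgraph (a single motif, or the small inter-motif graph), the per-pass cost scales with motif sizes rather than with the full residue graph, which is where the efficiency gain in the abstract comes from.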