🤖 AI Summary
Existing graph neural networks struggle to efficiently capture multiscale features and long-range dependencies in protein structure modeling. To address this limitation, this work proposes a hierarchical graph modeling framework grounded in geometric secondary structure motifs. The approach constructs both fine-grained subgraphs of individual motifs and a coarse-grained inter-motif graph, employing a two-stage GNN architecture to separately learn local conformations and global topology while integrating spatial arrangement and relative orientation information. This design enables modular selection of GNN backbones, achieving substantial gains in both computational efficiency and predictive accuracy without compromising expressiveness. Extensive experiments across multiple benchmark tasks demonstrate superior performance and reduced computational overhead, confirming the method's effectiveness and generalizability.
📄 Abstract
Graph neural networks (GNNs) have emerged as powerful tools for learning protein structures by capturing spatial relationships at the residue level. However, existing GNN-based methods often struggle to learn multiscale representations and to model long-range dependencies efficiently. In this work, we propose an efficient multiscale graph-based learning framework tailored to proteins. Our framework comprises two key components: (1) It constructs a hierarchical graph representation consisting of a collection of fine-grained subgraphs, each corresponding to a secondary structure motif (e.g., $\alpha$-helices, $\beta$-strands, loops), and a single coarse-grained graph that connects these motifs based on their spatial arrangement and relative orientation. (2) It employs two GNNs for feature learning: the first operates within individual secondary structure motifs to capture local interactions, and the second models higher-level structural relationships across motifs. Our modular framework allows a flexible choice of GNN at each stage. Theoretically, we show that our hierarchical framework preserves the desired maximal expressiveness, ensuring no loss of critical structural information. Empirically, we demonstrate that integrating baseline GNNs into our multiscale framework substantially improves prediction accuracy and reduces computational cost across various benchmarks.
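The two-stage design described in the abstract can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's implementation: it uses simple mean-aggregation message passing as a stand-in for whichever GNN backbone is plugged into each stage, and the function names (`motif_subgraph_pass`, `hierarchical_pass`) are hypothetical. Stage 1 runs a GNN inside each secondary-structure motif subgraph and pools it to a motif embedding; Stage 2 runs a GNN over the coarse inter-motif graph.

```python
import numpy as np

def motif_subgraph_pass(node_feats, edges):
    """One round of mean-aggregation message passing over an undirected graph.

    Stands in for any GNN backbone; the framework is agnostic to this choice.
    node_feats: (n, d) array; edges: list of (u, v) index pairs.
    """
    n = node_feats.shape[0]
    agg = np.zeros_like(node_feats)
    deg = np.zeros(n)
    for u, v in edges:
        agg[u] += node_feats[v]
        agg[v] += node_feats[u]
        deg[u] += 1
        deg[v] += 1
    deg = np.maximum(deg, 1)  # avoid division by zero for isolated nodes
    return node_feats + agg / deg[:, None]  # residual update

def hierarchical_pass(motifs, motif_edges):
    """Two-stage hierarchical pass.

    motifs: list of (residue_feats, intra_motif_edges), one per secondary
            structure motif (e.g., a helix, strand, or loop).
    motif_edges: edges of the coarse-grained inter-motif graph, built from
            the motifs' spatial arrangement and relative orientation.
    """
    # Stage 1: fine-grained GNN within each motif, mean-pooled to one
    # embedding per motif.
    motif_embs = np.stack([
        motif_subgraph_pass(feats, edges).mean(axis=0)
        for feats, edges in motifs
    ])
    # Stage 2: coarse-grained GNN over the inter-motif graph.
    return motif_subgraph_pass(motif_embs, motif_edges)

# Toy example: two motifs (3 and 2 residues, 4-d features) joined by one
# coarse edge; output is one refined embedding per motif.
motifs = [
    (np.ones((3, 4)), [(0, 1), (1, 2)]),
    (np.ones((2, 4)), [(0, 1)]),
]
out = hierarchical_pass(motifs, [(0, 1)])
print(out.shape)  # (2, 4)
```

Because each stage only sees a subgraph (a single motif, or the small inter-motif graph), the per-pass cost scales with motif sizes rather than with the full residue graph, which is where the efficiency gain in the abstract comes from.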