🤖 AI Summary
This work addresses the challenge of achieving reliable long-term dynamical forecasting in ultra-large complex systems, such as climate, biological, and technological networks, while preserving model interpretability. The authors propose the Sparse Identification Graph Neural Network (SIGN), which formulates symbolic equation discovery as an edge-wise sparse regression problem on graphs, thereby decoupling equation discovery from system size. SIGN infers governing equations directly from data in networks exceeding one hundred thousand nodes, a scale the authors describe as previously inaccessible to equation discovery, and so combines interpretability with scalability. It remains robust to noise, sparse sampling, and missing data. Experiments show that SIGN recovers governing equations with high precision and sustains accurate long-horizon predictions across multiple benchmark systems, including coupled chaotic oscillators, neural dynamics, and epidemic spreading. On real-world sea surface temperature data comprising 71,987 spatial points, it identifies a compact model that captures large-scale temperature conditions up to two years in advance.
📝 Abstract
Predicting the behavior of ultra-large complex systems, from climate to biological and technological networks, is a central unsolved challenge. Existing approaches face a fundamental trade-off: equation discovery methods provide interpretability but fail to scale, while neural networks scale but operate as black boxes and often lose reliability over long time horizons. Here, we introduce the Sparse Identification Graph Neural Network (SIGN), a framework that overcomes this divide by inferring the governing equations of large networked systems from data. By posing symbolic discovery at the level of edges, SIGN decouples the scalability of sparse identification from network size, enabling efficient equation discovery even in large systems. SIGN handles networks with over 100,000 nodes while remaining robust to noise, sparse sampling, and missing data. Across diverse benchmark systems, including coupled chaotic oscillators, neural dynamics, and epidemic spreading, it recovers governing equations with high precision and sustains accurate long-term predictions. Applied to temperature time series from 71,987 sea surface locations, SIGN identifies a compact predictive network model and captures large-scale sea surface temperature conditions up to two years in advance. By enabling equation discovery at previously inaccessible scales, SIGN opens a path toward interpretable and reliable prediction of real-world complex systems.
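To make the edge-level idea concrete, here is a minimal sketch of why a shared, edge-wise term library keeps the number of unknowns independent of network size. It is not the SIGN implementation: it replaces the graph neural network with plain sequentially thresholded least squares (as in SINDy-style sparse regression), and it assumes a known adjacency matrix, noise-free derivatives, and a hand-picked candidate library. The toy dynamics, the library terms, and the names `build_library` and `stlsq` are all hypothetical choices made for illustration.

```python
import numpy as np

def build_library(X, A):
    """Candidate terms shared by all nodes.

    X: (T, N) node states, A: (N, N) adjacency with A[i, j] = 1 for an edge j -> i.
    Returns a (T*N, 6) matrix: 3 self terms and 3 edge terms summed over neighbours.
    """
    xi = X[:, :, None]            # state of the receiving node i
    xj = X[:, None, :]            # state of the sending node j
    diff = xj - xi                # (T, N, N) pairwise differences
    self_terms = [np.ones_like(X), X, X**3]
    pair_terms = [diff, np.sin(diff), xi * xj]
    # Aggregate each edge-level term over incoming edges: sum_j A_ij g(x_i, x_j)
    coupled = [np.einsum("ij,tij->ti", A, g) for g in pair_terms]
    return np.stack([c.ravel() for c in self_terms + coupled], axis=1)

def stlsq(Theta, y, threshold=0.1, iters=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    coef = np.linalg.lstsq(Theta, y, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        keep = ~small
        if not keep.any():
            break
        coef[keep] = np.linalg.lstsq(Theta[:, keep], y, rcond=None)[0]
    return coef

# Toy system (an assumption for this sketch):
#   dx_i/dt = x_i - x_i^3 + 0.5 * sum_j A_ij sin(x_j - x_i)
rng = np.random.default_rng(0)
T, N = 200, 20
A = (rng.random((N, N)) < 0.2).astype(float)
np.fill_diagonal(A, 0.0)
X = rng.uniform(-2.0, 2.0, size=(T, N))          # sampled states
diff = X[:, None, :] - X[:, :, None]
dX = X - X**3 + 0.5 * np.einsum("ij,tij->ti", A, np.sin(diff))  # exact derivatives

Theta = build_library(X, A)
coef = stlsq(Theta, dX.ravel())

names = ["1", "x_i", "x_i^3", "sum_j A_ij (x_j - x_i)",
         "sum_j A_ij sin(x_j - x_i)", "sum_j A_ij x_i x_j"]
for name, c in zip(names, coef):
    if c != 0.0:
        print(f"{c:+.3f} * {name}")
```

In this sketch the regression always has six unknown coefficients, whether the network has 20 nodes or many more; adding nodes only adds rows to `Theta`. Real data would require estimating derivatives from noisy time series, which this example deliberately avoids.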