Improving Spatio-Temporal Residual Error Propagation by Mitigating Over-Squashing

📅 2026-05-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the performance degradation of recurrent temporal models in long-horizon forecasting, which often stems from accumulated residual errors and from neglecting spatial correlations among residuals. To mitigate these issues, the authors propose the Teger module, which uses discrete Forman curvature to identify information-bottleneck edges and dynamically rewires the graph structure to strengthen them, thereby alleviating over-squashing. Additionally, a low-rank-plus-diagonal covariance head is designed, using the Woodbury identity to model spatio-temporal residual correlations efficiently and improve uncertainty quantification. Theoretical analysis connects curvature-based rewiring to spectral connectivity, effective resistance, and covariance calibration. Teger is implemented as a plug-and-play component compatible with backbone architectures such as LSTM, Transformer, and xLSTM, and it consistently reduces CRPS across four real-world spatio-temporal datasets.
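The curvature-based identification of bottleneck edges can be illustrated with a minimal sketch. It assumes the simplest unweighted form of Forman curvature, F(u, v) = 4 - deg(u) - deg(v), which may differ from the variant used in the paper. The function names, the `boost` factor, and the rule of multiplying the weight of the most negatively curved edges are hypothetical choices for illustration, not the authors' rewiring procedure.

```python
import numpy as np

def forman_curvature(adj):
    """Unweighted Forman curvature 4 - deg(u) - deg(v) for each edge
    of a symmetric 0/1 adjacency matrix (assumed simplified form)."""
    adj = np.asarray(adj)
    deg = adj.sum(axis=1)
    curv = {}
    n = adj.shape[0]
    for u in range(n):
        for v in range(u + 1, n):
            if adj[u, v]:
                curv[(u, v)] = 4 - int(deg[u]) - int(deg[v])
    return curv

def strengthen_bottlenecks(adj, k=1, boost=2.0):
    """Hypothetical rewiring rule: multiply the weight of the k most
    negatively curved edges by `boost`."""
    weights = np.asarray(adj, dtype=float).copy()
    curv = forman_curvature(adj)
    most_negative = sorted(curv, key=curv.get)[:k]
    for u, v in most_negative:
        weights[u, v] *= boost
        weights[v, u] *= boost
    return weights, most_negative

# Two triangles (0, 1, 2) and (3, 4, 5) joined by the bridge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
bridge_graph = np.zeros((6, 6), dtype=int)
for u, v in edges:
    bridge_graph[u, v] = bridge_graph[v, u] = 1

curvature = forman_curvature(bridge_graph)
# The bridge (2, 3) has curvature 4 - 3 - 3 = -2, the lowest in the graph.
rewired, boosted_edges = strengthen_bottlenecks(bridge_graph, k=1)
```

In this toy graph the bridge is the only path between the two clusters, so it is the edge through which all cross-cluster information is squeezed; the sketch flags it because it has the most negative curvature.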
📝 Abstract
Residual error propagation remains a fundamental problem in recurrent models, where small prediction inaccuracies compound over time and degrade long-horizon performance. Accurately modeling the correlation structure of such residuals is critical for reliable uncertainty quantification in probabilistic multivariate time-series forecasting. While recent time-series deep models efficiently parametrize time-varying contemporaneous correlations, they often assume temporal independence of errors and neglect spatial correlation across the observed network. In this paper, we introduce Teger, a structured uncertainty module that overcomes the spatial and temporal limitations of error-correlated autoregressive forecasting. Teger introduces a spatial curvature-aware graph rewiring mechanism that explicitly strengthens information-bottleneck edges identified by discrete Forman curvature. The component is integrated into a low-rank-plus-diagonal covariance head, preserving tractable inference via the Woodbury identity. Teger is backbone-agnostic, requiring only the latent state produced by any autoregressive encoder. We evaluate Teger experimentally on LSTM, Transformer, and xLSTM backbones across four real-world spatio-temporal datasets, showing consistent improvement in Continuous Ranked Probability Score (CRPS). We further provide a formal theoretical analysis connecting curvature-aware rewiring to (i) over-squashing alleviation, (ii) improved spectral connectivity, (iii) reduced effective resistance, and (iv) improved covariance calibration bounds.
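To show why a low-rank-plus-diagonal covariance keeps inference tractable, the sketch below evaluates a zero-mean Gaussian log-density with covariance Σ = D + U Uᵀ using the Woodbury identity and the matrix determinant lemma, so only an r × r system is solved instead of an n × n one. The function name and the arguments `resid`, `diag`, and `factor` are assumptions for illustration; how Teger actually parametrizes D and U is not specified here.

```python
import numpy as np

def lowrank_gaussian_logpdf(resid, diag, factor):
    """Log-density of N(0, diag(diag) + factor @ factor.T) at `resid`,
    computed without forming or inverting the n x n covariance."""
    resid = np.asarray(resid, dtype=float)
    diag = np.asarray(diag, dtype=float)
    factor = np.asarray(factor, dtype=float)
    n, r = factor.shape

    d_inv_resid = resid / diag              # D^{-1} resid
    d_inv_factor = factor / diag[:, None]   # D^{-1} U

    # r x r capacitance matrix C = I + U^T D^{-1} U
    cap = np.eye(r) + factor.T @ d_inv_factor
    chol = np.linalg.cholesky(cap)

    # Woodbury: resid^T Sigma^{-1} resid
    #   = resid^T D^{-1} resid - w^T C^{-1} w, with w = U^T D^{-1} resid
    z = np.linalg.solve(chol, factor.T @ d_inv_resid)
    maha = resid @ d_inv_resid - z @ z

    # Matrix determinant lemma: log|Sigma| = log|D| + log|C|
    logdet = np.log(diag).sum() + 2.0 * np.log(np.diag(chol)).sum()

    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + maha)

# Compare against a dense reference on random inputs.
rng = np.random.default_rng(0)
n_nodes, rank = 50, 3
diag = rng.uniform(0.5, 2.0, size=n_nodes)
factor = rng.normal(size=(n_nodes, rank))
resid = rng.normal(size=n_nodes)

fast = lowrank_gaussian_logpdf(resid, diag, factor)

sigma = np.diag(diag) + factor @ factor.T
_, dense_logdet = np.linalg.slogdet(sigma)
dense = -0.5 * (n_nodes * np.log(2.0 * np.pi) + dense_logdet
                + resid @ np.linalg.solve(sigma, resid))
```

The structured path costs roughly O(n r² + r³) per evaluation, compared with O(n³) for the dense reference, which is what makes this kind of covariance head practical when the number of nodes n is large and the rank r is small.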
Problem

Research questions and friction points this paper is trying to address.

residual error propagation
spatio-temporal forecasting
uncertainty quantification
over-squashing
error correlation
Innovation

Methods, ideas, or system contributions that make the work stand out.

over-squashing mitigation
curvature-aware graph rewiring
structured uncertainty
spatio-temporal forecasting
residual error propagation