🤖 AI Summary
This work addresses the challenge of predicting rare, large-scale events in scale-invariant processes—such as earthquakes and avalanches—where such events are scarce in training data, severely testing a model’s extrapolation capability. The study systematically investigates embedding scale invariance as an inductive bias into neural architectures, evaluating established approaches like U-Net and Riesz networks while introducing novel designs that integrate wavelet decomposition with graph neural networks, Fourier embedding layers, and Fourier–Mellin neural operators. Experiments on two-dimensional fractional Gaussian fields and the Abelian sandpile model demonstrate that the proposed methods significantly improve prediction performance for large-magnitude events. The analysis further reveals that spectral bias and inadequate coarse-grained representations constitute primary bottlenecks limiting the extrapolation capacity of current models.
📝 Abstract
Machine Learning (ML) has recently transformed fields such as Language and Vision, and we may expect it to become relevant to the analysis of complex systems as well. Here we tackle the question of how, and to what extent, one can regress scale-free processes, i.e. processes displaying power-law behavior, such as earthquakes or avalanches. We are interested in predicting the large events, which are rare in the training set and therefore demand extrapolation capabilities from the model. For this we consider two paradigmatic problems that are statistically self-similar. The first is a 2-dimensional fractional Gaussian field obeying linear dynamics, self-similar by construction and amenable to exact analysis. The second is the Abelian sandpile model, which exhibits self-organized criticality. The emerging paradigm of Geometric Deep Learning shows that including known symmetries in the model's architecture is key to success. Here one may hope to extrapolate only by leveraging scale invariance. This is, however, a peculiar symmetry, as it involves possibly non-trivial coarse-graining operations and anomalous scaling. We perform experiments on various existing architectures, such as the U-Net and the Riesz network (scale invariant by construction), as well as our own proposals: a wavelet-decomposition-based Graph Neural Network (with discrete scale symmetry), a Fourier embedding layer, and a Fourier–Mellin Neural Operator. Based on these experiments and a complete characterization of the linear case, we identify the main obstacles, related to spectral bias and coarse-grained representations, and discuss how to alleviate them with the relevant inductive biases.
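To make the second benchmark concrete, here is a minimal sketch of the Abelian sandpile dynamics the abstract refers to: grains are dropped on a 2D lattice, any site holding four or more grains topples and sends one grain to each neighbor, and the resulting cascade sizes follow a power law in the self-organized critical state. This is a standard textbook formulation, not the authors' experimental code; the lattice size, grain count, and function names are illustrative choices.

```python
import numpy as np

def topple(grid):
    """Relax the sandpile in place: any site with >= 4 grains topples,
    sending one grain to each of its 4 neighbors (grains falling off
    the open boundary are lost). Returns the avalanche size, i.e. the
    total number of topplings triggered by the last grain drop."""
    size = 0
    while True:
        unstable = np.argwhere(grid >= 4)
        if len(unstable) == 0:
            return size
        for i, j in unstable:
            n = grid[i, j] // 4          # topple a site as many times as needed
            grid[i, j] -= 4 * n
            size += n
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < grid.shape[0] and 0 <= jj < grid.shape[1]:
                    grid[ii, jj] += n    # boundary grains simply dissipate

def sandpile_avalanches(L=32, n_grains=20000, seed=0):
    """Drive an L x L sandpile by dropping grains at uniformly random
    sites; return the avalanche size recorded after each drop."""
    rng = np.random.default_rng(seed)
    grid = np.zeros((L, L), dtype=int)
    sizes = []
    for _ in range(n_grains):
        i, j = rng.integers(0, L, size=2)
        grid[i, j] += 1
        sizes.append(topple(grid))
    return np.array(sizes)
```

After a transient, the avalanche-size histogram is broad and heavy-tailed: most drops trigger no toppling at all, while rare drops trigger cascades spanning much of the lattice. Those rare large avalanches are exactly the events that are scarce in a training set and force a learned model to extrapolate.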