Scalable Machine Learning Algorithms using Path Signatures

📅 2025-06-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key challenges in modeling time series and graph-structured data: evolving dynamics, long-range dependencies, and irregular sampling. It develops scalable machine learning models grounded in path signatures, connecting rough path theory with probabilistic modeling, deep learning, and kernel methods. The contributions are: (i) Gaussian processes with signature-kernel covariance functions for uncertainty-aware time series modeling; (ii) Seq2Tens, which uses low-rank tensor structure in the weight space to model long-range dependencies at scale; (iii) graph models in which expected signatures over graphs induce hypo-elliptic diffusion processes, as an expressive yet tractable alternative to standard graph neural networks; (iv) Random Fourier Signature Features, a scalable kernel approximation with theoretical guarantees; and (v) Recurrent Sparse Spectrum Signature Gaussian Processes, which combine Gaussian processes, signature kernels, and random features with a forgetting mechanism for multi-horizon time series forecasting with adaptive context length.

📝 Abstract
The interface between stochastic analysis and machine learning is a rapidly evolving field, with path signatures (iterated integrals that provide faithful, hierarchical representations of paths) offering a principled and universal feature map for sequential and structured data. Rooted in rough path theory, path signatures are invariant to reparameterization and well-suited for modelling evolving dynamics, long-range dependencies, and irregular sampling, which are common challenges in real-world time series and graph data. This thesis investigates how to harness the expressive power of path signatures within scalable machine learning pipelines. It introduces a suite of models that combine theoretical robustness with computational efficiency, bridging rough path theory with probabilistic modelling, deep learning, and kernel methods. Key contributions include: Gaussian processes with signature kernel-based covariance functions for uncertainty-aware time series modelling; the Seq2Tens framework, which employs low-rank tensor structure in the weight space for scalable deep modelling of long-range dependencies; and graph-based models where expected signatures over graphs induce hypo-elliptic diffusion processes, offering expressive yet tractable alternatives to standard graph neural networks. Further developments include Random Fourier Signature Features, a scalable kernel approximation with theoretical guarantees, and Recurrent Sparse Spectrum Signature Gaussian Processes, which combine Gaussian processes, signature kernels, and random features with a principled forgetting mechanism for multi-horizon time series forecasting with adaptive context length. We hope this thesis serves as both a methodological toolkit and a conceptual bridge, and provides a useful reference for the current state of the art in scalable, signature-based learning for sequential and structured data.
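To make the notion of a path signature concrete, here is a minimal sketch that computes the first two signature levels of a piecewise-linear path with NumPy. It is an illustration only and not code from the thesis; the function name `signature_depth2` is hypothetical, and truncating at depth 2 is an assumption made for brevity. Segments are combined with Chen's identity, and the example also checks the reparameterization invariance mentioned above by resampling the same path with extra points.

```python
import numpy as np


def signature_depth2(path):
    """Return levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is array-like of shape (n_points, d). Level 1 is the total
    increment; level 2 holds the iterated integrals of dx^i dx^j over
    ordered pairs of times.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for dx in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment with increment dx:
        # the segment's own level-2 term is dx (x) dx / 2, and the cross term
        # pairs everything accumulated so far with the new increment.
        level2 += np.outer(level1, dx) + 0.5 * np.outer(dx, dx)
        level1 += dx
    return level1, level2


# An L-shaped path in the plane, and the same path sampled more finely.
coarse = [[0, 0], [1, 0], [1, 1]]
fine = [[0, 0], [0.5, 0], [1, 0], [1, 0.25], [1, 1]]

c1, c2 = signature_depth2(coarse)
f1, f2 = signature_depth2(fine)

# Levy area: the antisymmetric part of level 2.
levy_area = 0.5 * (c2[0, 1] - c2[1, 0])
same_signature = np.allclose(c1, f1) and np.allclose(c2, f2)
```

For the L-shaped path, level 1 is the total displacement `[1, 1]`, level 2 is `[[0.5, 1.0], [0.0, 0.5]]`, and the Lévy area is `0.5`. The finer sampling traces the same curve, so both levels agree, which is the invariance property in a discrete setting.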
Problem

Research questions and friction points this paper is trying to address.

Develop scalable machine learning models using path signatures
Address challenges in time series and graph data analysis
Bridge rough path theory with modern machine learning techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian processes with signature kernel-based covariance
Seq2Tens framework for scalable deep modelling
Random Fourier Signature Features for kernel approximation
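As background for the kernel-approximation item above, the following is a minimal sketch of classical random Fourier features for the Gaussian (RBF) kernel on fixed-size vectors. It is not the Random Fourier Signature Features construction from the thesis, which operates on sequences; it only illustrates the underlying idea of replacing a kernel evaluation with an inner product of random features. The function name `rff_features` and all parameter values are assumptions chosen for the example.

```python
import numpy as np


def rff_features(X, n_features, lengthscale, rng):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)).
    """
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density, a Gaussian
    # with standard deviation 1 / lengthscale; phases are uniform on [0, 2*pi).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# Approximate Gram matrix from the feature map.
Phi = rff_features(X, n_features=5000, lengthscale=1.0, rng=rng)
K_approx = Phi @ Phi.T

# Exact Gram matrix for comparison.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-0.5 * sq_dists)

max_abs_error = np.max(np.abs(K_approx - K_exact))
```

The cost of building `Phi` grows linearly in the number of samples, whereas the exact Gram matrix grows quadratically; this trade of exactness for cost is the general motivation for random-feature approximations.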