Nonparametric learning of stochastic differential equations from sparse and noisy data

📅 2025-08-15
🤖 AI Summary
This work addresses nonparametric learning of drift functions in stochastic differential equations (SDEs) from sparse and noisy observations. We propose EM-SMC-RKHS, a unified framework that integrates sequential Monte Carlo (SMC) for efficient Bayesian inference of latent state trajectories, penalized likelihood estimation in a reproducing kernel Hilbert space (RKHS), and a generalized representer theorem that enables structure-free function learning; a Bayesian shrinkage prior is further introduced to control model complexity automatically. Unlike conventional parametric approaches, our method imposes no assumptions on the functional form of the drift, substantially reducing reliance on domain-specific prior knowledge. Experiments demonstrate accurate and robust drift estimation even under extremely low sampling rates and high observational noise. The framework provides an interpretable, data-efficient, nonparametric inference paradigm for modeling dynamical systems with complex or partially unknown mechanisms.
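The representer theorem mentioned above implies that the penalized drift estimate is a finite kernel expansion b(x) = Σⱼ αⱼ k(x, xⱼ). As a minimal illustration of that idea only, and not the paper's EM-SMC-RKHS procedure, the sketch below assumes densely observed data (no latent segments, so no SMC is needed) and fits the expansion to Euler-Maruyama increments by kernel ridge regression; all names and parameter choices are ours:

```python
import numpy as np

def rbf_kernel(x, y, ell=0.5):
    """Gaussian (RBF) kernel; any positive-definite kernel would do."""
    return np.exp(-((x - y) ** 2) / (2.0 * ell ** 2))

def fit_drift(path, dt, lam=1e-2, ell=0.5):
    """Fit b(x) = sum_j alpha_j k(x, x_j) by kernel ridge regression on
    Euler-Maruyama increments (X_{t+dt} - X_t) / dt, whose conditional
    mean is approximately b(X_t).  The representer theorem guarantees
    the penalized minimizer takes exactly this finite form."""
    x = path[:-1]                        # expansion centers = observed states
    y = np.diff(path) / dt               # noisy pointwise drift targets
    K = rbf_kernel(x[:, None], x[None, :], ell)
    alpha = np.linalg.solve(K + lam * len(x) * np.eye(len(x)), y)
    return x, alpha

def eval_drift(x_new, centers, alpha, ell=0.5):
    """Evaluate the fitted kernel expansion at new points."""
    return rbf_kernel(np.asarray(x_new, float)[:, None], centers[None, :], ell) @ alpha

# Recover the drift b(x) = -x of an Ornstein-Uhlenbeck process.
rng = np.random.default_rng(0)
dt, n = 0.04, 2000
X = np.empty(n)
X[0] = 1.0
for t in range(n - 1):
    X[t + 1] = X[t] - X[t] * dt + np.sqrt(dt) * rng.standard_normal()
centers, alpha = fit_drift(X, dt)
d = eval_drift([-1.0, 1.0], centers, alpha)
print(d)  # should be positive near x = -1 and negative near x = 1, since b(x) = -x
```

Note that the individual targets (X_{t+dt} - X_t)/dt are extremely noisy (variance of order 1/dt); the RKHS penalty is what makes the pooled estimate stable, which is the same role it plays inside the paper's M-step.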

📝 Abstract
The paper proposes a systematic framework for building data-driven stochastic differential equation (SDE) models from sparse, noisy observations. Unlike traditional parametric approaches, which assume a known functional form for the drift, our goal here is to learn the entire drift function directly from data without strong structural assumptions, making it especially relevant in scientific disciplines where system dynamics are partially understood or highly complex. We cast the estimation problem as minimization of the penalized negative log-likelihood functional over a reproducing kernel Hilbert space (RKHS). In the sparse observation regime, the presence of unobserved trajectory segments makes the SDE likelihood intractable. To address this, we develop an Expectation-Maximization (EM) algorithm that employs a novel Sequential Monte Carlo (SMC) method to approximate the filtering distribution and generate Monte Carlo estimates of the E-step objective. The M-step then reduces to a penalized empirical risk minimization problem in the RKHS, whose minimizer is given by a finite linear combination of kernel functions via a generalized representer theorem. To control model complexity across EM iterations, we also develop a hybrid Bayesian variant of the algorithm that uses shrinkage priors to identify significant coefficients in the kernel expansion. We establish important theoretical convergence results for both the exact and approximate EM sequences. The resulting EM-SMC-RKHS procedure enables accurate estimation of the drift function of stochastic dynamical systems in low-data regimes and is broadly applicable across domains requiring continuous-time modeling under observational constraints. We demonstrate the effectiveness of our method through a series of numerical experiments.
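In symbols, and with notation assumed by us rather than taken from the paper, the pipeline the abstract describes can be sketched as:

```latex
% Penalized likelihood estimation of the drift b over an RKHS H_k
\hat{b} \;=\; \operatorname*{arg\,min}_{b \in \mathcal{H}_k}
   \Bigl\{ -\log L\bigl(b \mid Y_{1:n}\bigr)
           \;+\; \lambda \,\lVert b \rVert_{\mathcal{H}_k}^{2} \Bigr\}

% E-step: expected complete-data log-likelihood, with the intractable
% expectation over latent trajectories X approximated by N SMC particles
Q\bigl(b \mid b^{(m)}\bigr) \;=\;
   \mathbb{E}_{X \sim p(\,\cdot \mid Y_{1:n},\, b^{(m)})}
   \bigl[\log L_{c}(b; X)\bigr]
   \;\approx\; \frac{1}{N}\sum_{i=1}^{N} \log L_{c}\bigl(b; X^{(i)}\bigr)

% M-step: by the generalized representer theorem, the penalized
% minimizer lives in the span of finitely many kernel sections
b^{(m+1)}(x) \;=\; \sum_{j} \alpha_j \, k\bigl(x, x_j\bigr)
```

The Bayesian shrinkage variant then places sparsity-inducing priors on the coefficients α_j so that insignificant kernel terms are driven toward zero across EM iterations.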
Problem

Research questions and friction points this paper is trying to address.

Learning stochastic differential equations from sparse noisy data
Estimating drift function without strong structural assumptions
Developing EM-SMC-RKHS algorithm for accurate drift estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nonparametric drift learning via RKHS minimization
EM algorithm with novel SMC approximation
Hybrid Bayesian shrinkage for model complexity
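The SMC ingredient above can be illustrated with a minimal bootstrap particle filter for a one-dimensional SDE observed with Gaussian noise at sparse times. This is a textbook stand-in under assumed names and settings, not the paper's tailored SMC scheme:

```python
import numpy as np

def bootstrap_pf(obs, obs_times, drift, sigma, obs_sd,
                 n_part=500, dt=0.01, seed=1):
    """Minimal bootstrap particle filter: propagate particles through the
    SDE with Euler-Maruyama between sparse observation times, then weight
    by the Gaussian observation likelihood and resample.  Returns the
    filtered posterior mean at each observation time."""
    rng = np.random.default_rng(seed)
    x = np.full(n_part, obs[0])              # initialize at first observation
    t = obs_times[0]
    means = [x.mean()]
    for y, t_next in zip(obs[1:], obs_times[1:]):
        # propagate particles through the unobserved segment
        while t < t_next - 1e-12:
            h = min(dt, t_next - t)
            x = x + drift(x) * h + sigma * np.sqrt(h) * rng.standard_normal(n_part)
            t += h
        # reweight by the observation density and resample (multinomial)
        logw = -0.5 * ((y - x) / obs_sd) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        x = rng.choice(x, size=n_part, p=w)
        means.append(x.mean())
    return np.array(means)

# Filter sparse, noisy observations of an OU process with drift b(x) = -x.
obs_times = np.array([0.0, 0.5, 1.0])
obs = np.array([0.0, 1.0, 0.5])
means = bootstrap_pf(obs, obs_times, drift=lambda x: -x,
                     sigma=0.5, obs_sd=0.5)
print(means)  # filtered means sit between the prior prediction and each observation
```

In the paper's E-step the particles play a further role: trajectories sampled this way supply the Monte Carlo estimate of the expected complete-data log-likelihood that the M-step then maximizes over the RKHS.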
Arnab Ganguly
Associate Professor, University of Wisconsin-Whitewater, USA
Pattern Matching, Computational Geometry, Succinct Data Structures
Riten Mitra
Department of Bioinformatics and Biostatistics, University of Louisville
Jinpu Zhou
Department of Mathematics, Louisiana State University