🤖 AI Summary
This work addresses the low computational efficiency of continuous-time maximum a posteriori (MAP) trajectory estimation for partially observed stochastic differential equations (SDEs) on parallel architectures. The method reformulates MAP estimation as a continuous-time optimal control problem based on the Onsager–Machlup functional and adapts a previously proposed parallel-in-time optimal control solver, built on parallel associative scan algorithms, to parallelize the computation over time. In the linear Gaussian case this yields a parallel Kalman–Bucy filter and a parallel continuous-time Rauch–Tung–Striebel (RTS) smoother, which are extended to nonlinear SDEs through Taylor expansions, together with a corresponding parallel two-filter smoother. GPU experiments on linear and nonlinear models show significant speedups while preserving the accuracy of the sequential algorithms.
📝 Abstract
This paper proposes a parallel-in-time method for computing continuous-time maximum-a-posteriori (MAP) trajectory estimates of the states of partially observed stochastic differential equations (SDEs), with the goal of improving computational speed on parallel architectures. The MAP estimation problem is reformulated as a continuous-time optimal control problem based on the Onsager-Machlup functional. This reformulation enables the use of a previously proposed parallel-in-time solution for optimal control problems, which we adapt to the current problem. The structure of the resulting optimal control problem admits a parallel solution based on parallel associative scan algorithms. In the linear Gaussian special case, it yields a parallel Kalman-Bucy filter and a parallel continuous-time Rauch-Tung-Striebel smoother. These linear computational methods are further extended to nonlinear continuous-time state-space models through Taylor expansions. We also present the corresponding parallel two-filter smoother. Graphics processing unit (GPU) experiments on linear and nonlinear models demonstrate that the proposed framework achieves a significant speedup in computations while maintaining the accuracy of sequential algorithms.
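The idea behind a parallel associative scan can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a scalar affine recursion x_k = a_k x_{k-1} + b_k as a stand-in for the filtering and smoothing recursions, and the names `combine`, `sequential_states`, and `scan_states` are hypothetical. Because composing affine maps is associative, all prefix compositions can be formed in about log2(n) rounds of independent pairwise combinations instead of n dependent steps.

```python
import math


def combine(first, second):
    """Compose two affine maps x -> a*x + b, applying `first` and then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def sequential_states(elems, x0):
    """Reference: run x_k = a_k * x_{k-1} + b_k one step at a time."""
    xs, x = [], x0
    for a, b in elems:
        x = a * x + b
        xs.append(x)
    return xs


def scan_states(elems, x0):
    """Same states via an inclusive prefix scan using recursive doubling."""
    prefix = list(elems)
    n = len(prefix)
    offset = 1
    while offset < n:
        # Every entry in this round depends only on the previous round,
        # so on parallel hardware all n combinations could run at once.
        prefix = [
            combine(prefix[i - offset], prefix[i]) if i >= offset else prefix[i]
            for i in range(n)
        ]
        offset *= 2
    # prefix[k] is now the composed map from x0 to the state after step k.
    return [a * x0 + b for a, b in prefix]


# Toy coefficients (illustrative values only, not taken from the paper).
elems = [(0.9, 0.1 * k) for k in range(1, 8)]
parallel = scan_states(elems, 1.0)
serial = sequential_states(elems, 1.0)
assert all(math.isclose(p, s) for p, s in zip(parallel, serial))
```

This pure-Python version runs the rounds serially and only shows the data dependencies. On a GPU the same pattern would normally be delegated to a library primitive such as `jax.lax.associative_scan`, with the scalar pairs replaced by whatever elements and combination rule the estimation problem defines.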