🤖 AI Summary
This work addresses the challenge of accurately modeling high-order statistical moments of turbulent dynamical systems from data by proposing a two-stage data-driven framework. In the first stage, the Finite Expression Method (FEX) automatically discovers the analytical form of the deterministic dynamics without requiring a pre-specified library of candidate functions, thereby recovering nonlinear interaction and external forcing terms. In the second stage, a generative model learns the residual stochastic component to correct the model error left by the first stage. By integrating symbolic discovery with stochastic modeling, the approach reconstructs the true dynamical structure across multiple stochastic triad systems and accurately predicts statistical moments up to the fifth order, demonstrating both effectiveness and robustness.
📝 Abstract
Turbulent dynamical systems are characterized by nonlinear interactions and stochastic effects that generate coupled statistical quantities, such as non-zero higher-order moments, which are difficult to capture accurately from data. We propose a two-stage data-driven modeling framework that combines symbolic regression with generative models to jointly identify the governing dynamics and predict their key statistical quantities. In Stage I, the Finite Expression Method (FEX) is adopted to discover closed-form expressions of the deterministic dynamics, recovering nonlinear interaction terms and external forcing without predefined libraries. In Stage II, generative models learn the residual stochastic components as a refined correction to the model error of the Stage I approximation, enabling accurate characterization of higher-order statistics. Theoretical analysis establishes the consistency of the symbolic estimator and quantifies the estimation error in terms of data size and numerical discretization. Model performance is verified through detailed numerical experiments on stochastic triad models across multiple regimes, demonstrating that the framework recovers interaction terms and forcing expressions and accurately predicts statistical moments up to order five. These results highlight the potential of integrating interpretable symbolic discovery with data-driven stochastic modeling for complex turbulent systems.
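The two-stage idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a commonly used triad form, du_i/dt = B_i u_j u_k - d_i u_i + F_i + sigma_i dW_i/dt with B_1 + B_2 + B_3 = 0, and all parameter values and function names are hypothetical. Stage I is replaced here by ordinary least squares on a small fixed set of candidate terms, whereas FEX searches for expressions without such a library. Stage II is replaced by the simplest possible generative model, a Gaussian fitted to the scaled residual.

```python
import numpy as np

# Hypothetical triad parameters, chosen only for illustration.
B = np.array([1.0, -0.6, -0.4])      # quadratic coupling, sums to zero
D = np.array([1.0, 1.0, 1.0])        # linear damping
F = np.array([1.0, 0.0, 0.0])        # constant external forcing
SIGMA = np.array([0.5, 0.5, 0.5])    # noise amplitude


def quad(u):
    """Quadratic triad interactions (u2*u3, u1*u3, u1*u2), row-wise."""
    u = np.atleast_2d(u)
    return np.stack(
        [u[:, 1] * u[:, 2], u[:, 0] * u[:, 2], u[:, 0] * u[:, 1]], axis=1
    )


def simulate(n_steps=20000, dt=1e-3, seed=0):
    """Euler-Maruyama integration of the assumed triad SDE."""
    rng = np.random.default_rng(seed)
    u = np.zeros((n_steps + 1, 3))
    for n in range(n_steps):
        drift = B * quad(u[n])[0] - D * u[n] + F
        noise = SIGMA * np.sqrt(dt) * rng.standard_normal(3)
        u[n + 1] = u[n] + drift * dt + noise
    return u


def fit_drift(u, dt):
    """Stage I stand-in: per-component least squares on [u_j*u_k, u_i, 1]."""
    dudt = np.diff(u, axis=0) / dt           # finite-difference derivative
    q = quad(u[:-1])
    coeffs = np.zeros((3, 3))
    resid = np.zeros_like(dudt)
    for i in range(3):
        A = np.column_stack([q[:, i], u[:-1, i], np.ones(len(q))])
        coeffs[i], *_ = np.linalg.lstsq(A, dudt[:, i], rcond=None)
        resid[:, i] = dudt[:, i] - A @ coeffs[i]
    return coeffs, resid


def central_moments(x, max_order=5):
    """Mean followed by central moments of order 2..max_order."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    higher = [np.mean((x - mu) ** k) for k in range(2, max_order + 1)]
    return np.array([mu] + higher)


dt = 1e-3
u = simulate(dt=dt)
coeffs, resid = fit_drift(u, dt)

# Stage II stand-in: Gaussian fitted to the residual, rescaled by sqrt(dt)
# so that it estimates the noise amplitude of the assumed SDE.
sigma_hat = resid.std(axis=0) * np.sqrt(dt)

# Empirical moments of the first component, up to order five.
moments_u1 = central_moments(u[:, 0])
```

Each row of `coeffs` holds the fitted interaction, linear, and constant coefficients for one component. With a trajectory this short the drift coefficients are estimated only roughly, while `sigma_hat` is typically much closer to the assumed noise amplitude, which loosely mirrors the abstract's point that the estimation error depends on data size and discretization.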