🤖 AI Summary
Inferring governing dynamical equations from noisy observational data remains a fundamental challenge in system identification.
Method: This paper proposes a two-level Bayesian inference framework grounded in statistical mechanics that decouples sparse equation discovery into variable selection and coefficient estimation, and derives closed-form analytical expressions for the posterior distribution.
Contribution/Results: We introduce the free energy and partition function to characterize sparsity- and noise-induced phase transitions, establishing a closed-loop inference mechanism that quantifies epistemic uncertainty and lets a model assess and validate its own noise tolerance. The method improves robustness in small-sample regimes, precisely delineates the phase boundary separating correct from incorrect equation identification, and provides verifiable fault-tolerance guarantees at prescribed noise levels.
📝 Abstract
Recovering dynamical equations from noisy observations is the central challenge of system identification. We develop a statistical mechanics approach to analyzing sparse equation discovery algorithms, which typically balance data fit against parsimony via hyperparameter tuning. In this framework, statistical mechanics offers tools to analyze the interplay between complexity and fitness, analogous to that between entropy and energy in physical systems. To establish this analogy, we cast the hyperparameter optimization procedure as a two-level Bayesian inference problem that separates variable selection from coefficient inference and enables computation of the posterior parameter distribution in closed form. Our approach provides uncertainty quantification, which is crucial in the low-data limit frequently encountered in real-world applications. A key advantage of employing statistical mechanical concepts, such as the free energy and partition function, is that they connect the large-data limit to the thermodynamic limit and characterize the sparsity- and noise-induced phase transitions that delineate correct from incorrect identification. We thus obtain a method for closed-loop inference: estimating the noise in a given model and checking whether the model is tolerant to that level of noise. This perspective on sparse equation discovery is versatile and can be adapted to various other equation discovery algorithms.
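The two-level structure described above can be sketched in a minimal form: at the lower level, a linear-Gaussian model gives the coefficient posterior in closed form; at the upper level, each candidate variable subset is scored by its log marginal likelihood (the log partition function, whose negative acts as a free energy), and the subset with the highest evidence is selected. The sketch below is an illustration under assumed choices (a toy system dx/dt = -2x, a polynomial library, known noise precision), not the paper's actual algorithm or data.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: samples of dx/dt = -2 x observed with Gaussian noise.
x = np.linspace(-1.0, 1.0, 50)
sigma = 0.05
y = -2.0 * x + sigma * rng.standard_normal(x.size)

# Candidate library of right-hand-side terms (an assumption for illustration).
library = {"1": np.ones_like(x), "x": x, "x^2": x**2, "x^3": x**3}

alpha = 1.0             # prior precision on coefficients
beta = 1.0 / sigma**2   # noise precision, assumed known here

def log_evidence(Theta, y, alpha, beta):
    """Closed-form log marginal likelihood of Bayesian linear regression.

    This is the log partition function of the coefficient posterior;
    its negative plays the role of a free energy for the candidate model."""
    N, d = Theta.shape
    A = alpha * np.eye(d) + beta * Theta.T @ Theta   # posterior precision
    m = beta * np.linalg.solve(A, Theta.T @ y)       # posterior mean (level 1)
    E = 0.5 * beta * np.sum((y - Theta @ m) ** 2) + 0.5 * alpha * m @ m
    return (0.5 * d * np.log(alpha) + 0.5 * N * np.log(beta) - E
            - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))

# Level 2: enumerate variable subsets and score each by its evidence.
scores = {}
for r in range(1, len(library) + 1):
    for subset in itertools.combinations(library, r):
        Theta = np.column_stack([library[t] for t in subset])
        scores[subset] = log_evidence(Theta, y, alpha, beta)

best = max(scores, key=scores.get)
print(best)  # the evidence's built-in Occam penalty should favor the single true term
```

The evidence automatically trades fit against complexity: the log-determinant and prior-precision terms penalize extra library terms unless they buy a comparable gain in likelihood, which is the complexity-fitness interplay the abstract frames thermodynamically.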