🤖 AI Summary
Problem: In SINDy-type sparse system identification, dictionary selection lacks theoretical grounding, compromising model accuracy and interpretability.
Method: This paper proposes a score-guided dictionary selection strategy, where the scores are projected reconstruction errors. It gives a theoretical analysis of score-driven dictionary selection that accounts for both the scores and the mutual coherence of dictionary terms. The analysis centers on Sequential Threshold Least Squares (STLS), which solves the ℓ₀ sparse least-squares problem by splitting it into a least-squares step and a proximal step for the sparse term, and it covers both the original and weak SINDy settings.
Contributions/Results: (1) It provides a theoretical analysis of the score and the dictionary-selection strategy; (2) in numerical experiments on ordinary and partial differential equations, score-based screening improves both accuracy and interpretability in system identification; and (3) the results suggest that score-guided dictionary refinement may, in some cases, improve the robustness of data-driven discovery of governing equations.
📝 Abstract
In this work, we revisit dictionary-based sparse regression, in particular Sequential Threshold Least Squares (STLS), and propose a score-guided library selection strategy that offers practical guidance for data-driven modeling, with emphasis on SINDy-type algorithms. STLS is an algorithm for the $\ell_0$ sparse least-squares problem; it relies on splitting to solve the least-squares portion efficiently while handling the sparse term via proximal methods. It produces coefficient vectors whose components depend on both the projected reconstruction errors, here referred to as the scores, and the mutual coherence of dictionary terms. The first contribution of this work is a theoretical analysis of the score and the dictionary-selection strategy, which can be understood in both the original and weak SINDy regimes. Second, numerical experiments on ordinary and partial differential equations highlight the effectiveness of score-based screening, which improves both accuracy and interpretability in dynamical system identification. These results suggest that using score-guided methods to refine the dictionary may, in some cases, help SINDy users improve the robustness of data-driven discovery of governing equations.
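To make the STLS iteration and the idea of a per-term score concrete, here is a minimal NumPy sketch. It is illustrative only and is not the authors' implementation: the function names `stls` and `projection_scores`, the threshold value, and the toy system dx/dt = -2x are all assumptions. The score used here, the relative residual left after projecting the target onto a single dictionary column, is one plausible reading of "projected reconstruction error" and may differ from the definition in the paper.

```python
import numpy as np

def stls(theta, y, threshold, max_iter=20):
    """Sequentially thresholded least squares for a single target vector y."""
    n_terms = theta.shape[1]
    active = np.ones(n_terms, dtype=bool)
    # Start from the dense least-squares solution.
    xi = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(max_iter):
        # Drop terms whose coefficient magnitude falls below the threshold.
        new_active = active & (np.abs(xi) >= threshold)
        if np.array_equal(new_active, active):
            break  # support is stable, so the iteration has converged
        active = new_active
        xi = np.zeros(n_terms)
        if active.any():
            # Refit least squares on the surviving columns only.
            xi[active] = np.linalg.lstsq(theta[:, active], y, rcond=None)[0]
    return xi

def projection_scores(theta, y):
    """Hypothetical score: relative residual of projecting y onto each column."""
    unit = theta / np.linalg.norm(theta, axis=0)
    proj = unit.T @ y                      # coefficient of y along each unit column
    resid_sq = np.maximum(y @ y - proj**2, 0.0)
    return np.sqrt(resid_sq / (y @ y))

# Toy problem (assumed for illustration): dx/dt = -2x with exact derivatives.
t = np.linspace(0.0, 2.0, 200)
x = np.exp(-2.0 * t)
dxdt = -2.0 * x
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])  # library: 1, x, x^2, x^3

xi = stls(theta, dxdt, threshold=0.1)
scores = projection_scores(theta, dxdt)
print("coefficients:", xi)
print("scores:", scores)
```

In this noiseless toy case the column for `x` has a score near zero because the target is an exact multiple of it, and STLS keeps only that term. With noisy data or strongly correlated library columns the two quantities can disagree, which is the regime where the abstract's point about mutual coherence becomes relevant.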