🤖 AI Summary
For sequential inference in highly nonlinear state-space models, this paper proposes StateMixNN, an end-to-end learning framework built on differentiable particle filtering. The method uses a pair of neural networks, one parameterizing the proposal distribution and the other the state transition distribution, with both distributions represented as differentiable Gaussian mixture models (GMMs). The authors present this as the first approach to jointly parameterize both distributions and optimize the log-likelihood via gradient-based learning. Unlike conventional particle filters or black-box deep state estimators, StateMixNN preserves distributional interpretability while substantially increasing expressiveness under strong nonlinearities. Empirical evaluation on several challenging nonlinear benchmarks shows that StateMixNN recovers the latent state more accurately than current state-of-the-art methods.
📝 Abstract
State-space models are a popular statistical framework for analysing sequential data. Within this framework, particle filters are often used to perform inference on non-linear state-space models. We introduce a new method, StateMixNN, that uses a pair of neural networks to learn the proposal distribution and transition distribution of a particle filter. Both distributions are approximated using multivariate Gaussian mixtures, whose component means and covariances are produced as outputs of the networks. Our method is trained by targeting the log-likelihood, thereby requiring only the observation series, and combines the interpretability of state-space models with the flexibility and approximation power of artificial neural networks. The proposed method significantly improves recovery of the hidden state in comparison with the state-of-the-art, showing greater improvement in highly non-linear scenarios.
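To make the core idea concrete, here is a minimal sketch, not the authors' code, of a network that maps a particle's previous state and the current observation to the parameters of a diagonal-covariance Gaussian mixture, which is then sampled as a particle-filter proposal. All names, dimensions, and the tiny numpy MLP are illustrative assumptions; the paper's actual architectures, mixture parameterization, and training loop may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: K mixture components, 2-D state, 1-D observation.
K, D_STATE, D_OBS, H = 3, 2, 1, 16

def init_mlp(d_in, d_out):
    # A toy one-hidden-layer MLP standing in for the proposal network.
    return {"W1": rng.normal(0, 0.1, (d_in, H)), "b1": np.zeros(H),
            "W2": rng.normal(0, 0.1, (H, d_out)), "b2": np.zeros(d_out)}

def mlp(params, x):
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

# One output vector encodes: K mixture logits, K*D means, K*D log-stddevs.
proposal_net = init_mlp(D_STATE + D_OBS, K + 2 * K * D_STATE)

def gmm_params(net, x_prev, y):
    out = mlp(net, np.concatenate([x_prev, y]))
    logits, rest = out[:K], out[K:]
    w = np.exp(logits - logits.max())
    w /= w.sum()                                            # mixture weights
    means = rest[:K * D_STATE].reshape(K, D_STATE)
    stds = np.exp(rest[K * D_STATE:].reshape(K, D_STATE))   # positive scales
    return w, means, stds

def sample_proposal(net, x_prev, y):
    # Draw one new particle: pick a component, then sample from its Gaussian.
    w, means, stds = gmm_params(net, x_prev, y)
    k = rng.choice(K, p=w)
    return means[k] + stds[k] * rng.normal(size=D_STATE)

x_prev = np.zeros(D_STATE)
y = np.array([0.5])
x_new = sample_proposal(proposal_net, x_prev, y)
```

Because every operation above is differentiable in the network weights (sampling aside, which is typically handled with a reparameterization trick in differentiable particle filters), gradients of a log-likelihood estimate can flow back into the mixture parameters, which is the mechanism the abstract describes.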