On the Relation of State Space Models and Hidden Markov Models

📅 2026-01-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study systematically clarifies the similarities, differences, and theoretical connections between state space models (SSMs) and hidden Markov models (HMMs) in sequence modeling. By employing a unified probabilistic graphical model framework, it compares classical probabilistic SSMs—including linear Gaussian SSMs and Kalman filtering—with HMMs and modern neural SSMs, analyzing their structural properties, inference algorithms, and learning mechanisms. The work identifies precise conditions under which these models are equivalent or fundamentally distinct, establishing formal correspondences across representation, inference, and training paradigms. In doing so, it provides a principled probabilistic interpretation of modern neural SSMs, bridges conceptual gaps among control theory, probabilistic modeling, and deep learning, and offers theoretical guidance for informed model selection and design.
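For orientation, the shared graphical-model structure the summary refers to can be written explicitly. The factorization below is the standard textbook form (the notation is ours, not quoted from the paper); the two model families differ only in the state space and the conditional distributions:

```latex
% Shared latent-variable factorization over latents z_{1:T} and observations x_{1:T}
\begin{align}
  p(x_{1:T}, z_{1:T})
    &= p(z_1)\,\prod_{t=2}^{T} p(z_t \mid z_{t-1})\,\prod_{t=1}^{T} p(x_t \mid z_t) \\
  \text{HMM:}\quad
    & z_t \in \{1,\dots,K\}, \qquad
      p(z_t \mid z_{t-1}) = \mathbf{A}_{z_{t-1},\, z_t} \\
  \text{LG-SSM:}\quad
    & z_t \in \mathbb{R}^d, \qquad
      p(z_t \mid z_{t-1}) = \mathcal{N}(z_t \mid A z_{t-1}, Q), \quad
      p(x_t \mid z_t) = \mathcal{N}(x_t \mid C z_t, R)
\end{align}
```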

📝 Abstract
State Space Models (SSMs) and Hidden Markov Models (HMMs) are foundational frameworks for modeling sequential data with latent variables and are widely used in signal processing, control theory, and machine learning. Despite their shared temporal structure, they differ fundamentally in the nature of their latent states, probabilistic assumptions, inference procedures, and training paradigms. Recently, deterministic state space models have re-emerged in natural language processing through architectures such as S4 and Mamba, raising new questions about the relationship between classical probabilistic SSMs, HMMs, and modern neural sequence models. In this paper, we present a unified and systematic comparison of HMMs, linear Gaussian state space models, Kalman filtering, and contemporary NLP state space models. We analyze their formulations through the lens of probabilistic graphical models, examine their inference algorithms -- including forward-backward inference and Kalman filtering -- and contrast their learning procedures via Expectation-Maximization and gradient-based optimization. By highlighting both structural similarities and semantic differences, we clarify when these models are equivalent, when they fundamentally diverge, and how modern NLP SSMs relate to classical probabilistic models. Our analysis bridges perspectives from control theory, probabilistic modeling, and modern deep learning.
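To make the contrast between forward-backward inference and Kalman filtering concrete, here is a minimal sketch of one filtering step in each family. This is illustrative code using the standard textbook recursions, not the paper's implementation; all variable names are ours.

```python
import numpy as np

def hmm_forward_step(alpha, A, B, obs):
    """One normalized HMM forward step.
    alpha: (K,) current filtered state distribution
    A:     (K, K) row-stochastic transition matrix
    B:     (K, V) emission probabilities; obs: observed symbol index
    """
    alpha = (alpha @ A) * B[:, obs]   # predict through A, correct by emission likelihood
    return alpha / alpha.sum()        # renormalize to a distribution over K states

def kalman_filter_step(m, P, A, C, Q, R, y):
    """One Kalman predict + update step for
    z_t = A z_{t-1} + w_t, w_t ~ N(0, Q);  x_t = C z_t + v_t, v_t ~ N(0, R).
    m: (d,) filtered mean; P: (d, d) filtered covariance; y: (p,) new observation.
    """
    m_pred = A @ m                               # predicted mean
    P_pred = A @ P @ A.T + Q                     # predicted covariance
    S = C @ P_pred @ C.T + R                     # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)          # Kalman gain
    m_new = m_pred + K @ (y - C @ m_pred)        # corrected mean
    P_new = (np.eye(len(m)) - K @ C) @ P_pred    # corrected covariance
    return m_new, P_new
```

In both recursions a prediction through the transition model is followed by a correction from the observation likelihood: the HMM propagates a discrete distribution over K states, while the Kalman filter propagates the mean and covariance of a Gaussian. This is the kind of structural parallel the paper formalizes.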
Problem

Research questions and friction points this paper is trying to address.

State Space Models
Hidden Markov Models
sequential data
latent variables
probabilistic modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

State Space Models
Hidden Markov Models
Kalman Filtering
Probabilistic Graphical Models
Neural Sequence Models
Aydin Ghojogh
Thunder Bay, ON, Canada
M. Sepanj
Waterloo, ON, Canada
Benyamin Ghojogh
AI Scientist
Machine Learning
Deep Learning
Theory