Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

📅 2023-05-29
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Traditional decoding algorithms for Hidden Markov Models (HMMs), such as Viterbi, have computational cost O(TN²) in sequence length T and state count N, which becomes expensive for very long sequences. Method: We propose Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure whose estimated state sequence sequentially maximizes local likelihood scores among local paths with at most three segments. QATS stores the data as cumulative sums and performs the maximization only approximately, using an adaptive search. Its complexity is polylogarithmic in T and cubic in N, so it is best suited to long sequences with relatively few states. The returned sequence is admissible: all of its transitions occur with positive probability. Contribution/Results: QATS is implemented in C++ and provided in the R package QATS, available from GitHub. Monte Carlo simulations demonstrate speedups over Viterbi and include a precision analysis of the returned sequences.
📝 Abstract
Hidden Markov models (HMMs) are characterized by an unobservable (hidden) Markov chain and an observable process, which is a noisy version of the hidden chain. Decoding the original signal (i.e., hidden chain) from the noisy observations is one of the main goals in nearly all HMM-based data analyses. Existing decoding algorithms such as the Viterbi algorithm have computational complexity at best linear in the length of the observed sequence, and sub-quadratic in the size of the state space of the Markov chain. We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure which decodes the hidden sequence in polylogarithmic computational complexity in the length of the sequence, and cubic in the size of the state space, hence particularly suited for large-scale HMMs with relatively few states. The procedure also suggests an effective way of data storage as specific cumulative sums. In essence, the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments. The maximization is performed only approximately using an adaptive search procedure. The resulting sequence is admissible in the sense that all transitions occur with positive probability. To complement formal results justifying our approach, we present Monte Carlo simulations which demonstrate the speedups provided by QATS in comparison to Viterbi, along with a precision analysis of the returned sequences. An implementation of QATS in C++ is provided in the R package QATS and is available from GitHub.
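For reference, the Viterbi baseline that the abstract compares against can be sketched in a few lines of Python. This is a textbook log-domain version written for illustration, not code from the QATS package; the function name and argument layout are assumptions. With a dense transition matrix, the nested loops do work proportional to T·N² for T observations and N states, which is the linear dependence on sequence length that QATS aims to avoid.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Most likely state path of an HMM, computed in the log domain.

    log_init[i]     : log P(first state = i)
    log_trans[i][j] : log P(next state = j | current state = i)
    log_emit[t][i]  : log density of observation t under state i
    """
    n_obs, n_states = len(log_emit), len(log_init)
    score = [log_init[i] + log_emit[0][i] for i in range(n_states)]
    back = []
    for t in range(1, n_obs):                 # one pass over the sequence
        prev, score, ptr = score, [], []
        for j in range(n_states):             # N * N work per time step
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][j] + log_emit[t][j])
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Toy example (made-up numbers): two "sticky" states, noisy binary observations.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
obs = [0, 0, 1, 0, 0, 1, 1, 1, 1]
log_emit = [[math.log(0.8 if y == s else 0.2) for s in range(2)] for y in obs]

print(viterbi(log_init, log_trans, log_emit))  # [0, 0, 0, 0, 0, 1, 1, 1, 1]
```

In the toy run, the isolated observation 1 at the third position is smoothed away because a round trip between states costs more than a single mismatched emission.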
Problem

Research questions and friction points this paper is trying to address.

Efficiently decoding hidden states from noisy HMM observations
Reducing the computational cost of decoding for large-scale HMMs
Approximating the best local path with at most three segments using an adaptive search procedure
Innovation

Methods, ideas, or system contributions that make the work stand out.

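Divide-and-conquer decoding with complexity polylogarithmic in the sequence length and cubic in the number of states
Sequentially maximizes local likelihood scores over local paths with at most three segments
Uses an adaptive search to perform the maximization approximately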
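The ideas listed above can be illustrated with a minimal Python sketch. It is not the QATS algorithm: it assumes the cumulative sums are per-state running totals of log emission densities, it ignores the initial distribution and transitions across the window boundary, and it maximizes by exhaustive search where QATS uses an adaptive search. All function names are hypothetical. The point it shows is that, once the cumulative sums are stored, the score of any piecewise-constant path with a few segments is available in constant time per segment.

```python
import math
from itertools import accumulate, combinations, product


def cumulative_log_emissions(log_emit, n_states):
    """cum[i][t] = sum of log_emit[s][i] over s < t (assumes finite values)."""
    return [[0.0] + list(accumulate(row[i] for row in log_emit))
            for i in range(n_states)]


def segment_score(cum, log_trans, state, lo, hi):
    """Log-score of staying in `state` on the half-open window [lo, hi)."""
    emissions = cum[state][hi] - cum[state][lo]   # O(1) thanks to cum
    stays = hi - lo - 1                           # number of self-transitions
    return emissions + (stays * log_trans[state][state] if stays else 0.0)


def path_score(cum, log_trans, states, cuts):
    """Log-score of a piecewise-constant path; cuts = [lo, k1, ..., hi]."""
    total = 0.0
    for idx, state in enumerate(states):
        total += segment_score(cum, log_trans, state, cuts[idx], cuts[idx + 1])
        if idx > 0:                               # jump between segments
            total += log_trans[states[idx - 1]][state]
    return total


def best_local_path(cum, log_trans, lo, hi, max_segments=3):
    """Exhaustive maximization over paths with at most `max_segments` segments.

    Brute force for illustration only: quadratic in the window length,
    unlike the adaptive search described in the abstract.
    """
    n_states = len(cum)
    best = (-math.inf, None, None)
    for k in range(1, max_segments + 1):
        for inner in combinations(range(lo + 1, hi), k - 1):
            cuts = [lo, *inner, hi]
            for states in product(range(n_states), repeat=k):
                if any(a == b for a, b in zip(states, states[1:])):
                    continue                      # adjacent segments must differ
                score = path_score(cum, log_trans, states, cuts)
                if score > best[0]:
                    best = (score, states, cuts)
    return best


# Toy example (made-up numbers): two "sticky" states, noisy binary observations.
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
obs = [0, 0, 1, 0, 0, 1, 1, 1, 1]
log_emit = [[math.log(0.8 if y == s else 0.2) for s in range(2)] for y in obs]

cum = cumulative_log_emissions(log_emit, 2)
score, states, cuts = best_local_path(cum, log_trans, 0, len(obs))
print(states, cuts)  # (0, 1) [0, 5, 9]
```

Here the best path with at most three segments uses only two: state 0 on the first five observations and state 1 on the remaining four.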