Estimating stationary mass, frequency by frequency

📅 2025-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies a single trajectory of length *n* drawn from an α-mixing stochastic process over a finite but potentially large state space. The target is the vector of stationary masses indexed by frequency: for each frequency, the probability that the stationary distribution places on the states occurring with that frequency in the observed sequence. Estimation error is measured in total variation distance. Methodologically, the paper combines the plug-in (empirical) estimator with WingIt, a recently proposed modification of the Good–Turing estimator originally developed for Markovian sequences, and derives new performance bounds for both estimators under α-mixing. Because Poissonization is no longer applicable outside the i.i.d. setting, the analysis relies on complementary tools, including concentration inequalities for a natural self-normalized statistic of mixing sequences. Theoretically, the estimator is universally consistent in *n* and recovers known results for i.i.d. sequences as special cases.

📝 Abstract
Suppose we observe a trajectory of length $n$ from an $\alpha$-mixing stochastic process over a finite but potentially large state space. We consider the problem of estimating the probability mass placed by the stationary distribution of any such process on elements that occur with a certain frequency in the observed sequence. We estimate this vector of probabilities in total variation distance, showing universal consistency in $n$ and recovering known results for i.i.d. sequences as special cases. Our proposed methodology carefully combines the plug-in (or empirical) estimator with a recently proposed modification of the Good--Turing estimator called \textsc{WingIt}, which was originally developed for Markovian sequences. En route to controlling the error of our estimator, we develop new performance bounds on \textsc{WingIt} and the plug-in estimator for $\alpha$-mixing stochastic processes. Importantly, the extensively used method of Poissonization can no longer be applied in our non-i.i.d. setting, and so we develop complementary tools -- including concentration inequalities for a natural self-normalized statistic of mixing sequences -- that may prove independently useful in the design and analysis of estimators for related problems.
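To make the estimation target concrete, the following is a minimal Python sketch of the two classical building blocks the abstract mentions: the plug-in estimator and the standard Good–Turing estimator of the mass on symbols seen exactly `zeta` times. It does not implement WingIt or the paper's combined estimator, whose details are not given here. The function names, the toy sequence, the uniform distribution `pi`, and the 1/2-times-L1 convention for total variation distance are all assumptions made for illustration.

```python
from collections import Counter


def frequency_profile(seq):
    """Map zeta -> number of distinct symbols occurring exactly zeta times."""
    counts = Counter(seq)
    return Counter(counts.values())


def plug_in_mass(seq, zeta):
    """Plug-in estimate: each symbol seen zeta times gets mass zeta / n."""
    n = len(seq)
    return zeta * frequency_profile(seq).get(zeta, 0) / n


def good_turing_mass(seq, zeta):
    """Classical Good-Turing estimate: (zeta + 1) * Phi_{zeta + 1} / n."""
    n = len(seq)
    return (zeta + 1) * frequency_profile(seq).get(zeta + 1, 0) / n


def true_mass(seq, zeta, pi):
    """Ground-truth mass under a known distribution pi (dict symbol -> prob)."""
    counts = Counter(seq)
    return sum(p for x, p in pi.items() if counts.get(x, 0) == zeta)


def tv_distance(estimate, truth):
    """Assumed convention: half the L1 distance between two mass vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(estimate, truth))


if __name__ == "__main__":
    seq = "abracadabra"                      # toy sequence (illustrative only)
    pi = {x: 1 / 6 for x in "abcdrz"}        # hypothetical uniform distribution
    n = len(seq)
    truth = [true_mass(seq, z, pi) for z in range(n + 1)]
    plug_in = [plug_in_mass(seq, z) for z in range(n + 1)]
    good_turing = [good_turing_mass(seq, z) for z in range(n + 1)]
    print("TV(plug-in, truth)     =", tv_distance(plug_in, truth))
    print("TV(Good-Turing, truth) =", tv_distance(good_turing, truth))
```

Note that the plug-in estimator always assigns zero mass to unseen symbols (`zeta = 0`), whereas Good–Turing uses the number of singletons to estimate it; this complementarity is one common motivation for combining the two kinds of estimator.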
Problem

Research questions and friction points this paper is trying to address.

Estimating stationary mass for α-mixing processes
Combining plug-in and WingIt estimators effectively
Developing new bounds for non-i.i.d. sequence analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines plug-in estimator with WingIt
Develops new bounds for α-mixing processes
Uses self-normalized concentration inequalities
Milind Nakul
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology
Vidya Muthukumar
Georgia Institute of Technology
machine learning theory, online decision-making, game theory
Ashwin Pananjady
H. Milton Stewart School of Industrial and Systems Engineering, School of Electrical and Computer Engineering, Georgia Institute of Technology