Differentially Private Data-Driven Markov Chain Modeling

📅 2026-02-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of building accurate data-driven Markov chain models while protecting the user data behind them. The authors propose a differential privacy mechanism for database queries whose outputs lie in the unit simplex, prove that it is differentially private, and bound the expected Kullback–Leibler divergence between private and non-private query outputs. They extend the mechanism to stochastic transition matrices, each row of which is a simplex-valued query. They then analytically bound how much privatization changes the stationary distribution and the convergence rate of the resulting Markov chain. In simulations under a typical privacy implementation, the stationary distribution incurs less than 2% error, indicating that the private model preserves the behavior of the original system.

📝 Abstract

Markov chains model a wide range of user behaviors. However, generating accurate Markov chain models requires substantial user data, and sharing these models without privacy protections may reveal sensitive information about the underlying user data. We introduce a method for protecting user data used to formulate a Markov chain model. First, we develop a method for privatizing database queries whose outputs are elements of the unit simplex, and we prove that this method is differentially private. We quantify its accuracy by bounding the expected KL divergence between private and non-private queries. We extend this method to privatize stochastic matrices whose rows are each a simplex-valued query of a database, which includes data-driven Markov chain models. To assess their accuracy, we analytically bound the change in the stationary distribution and the change in the convergence rate between a non-private Markov chain model and its private form. Simulations show that under a typical privacy implementation, our method yields less than 2% error in the stationary distribution, indicating that our approach to private modeling faithfully captures the behavior of the systems we study.
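The quantities named in the abstract can be illustrated with a small numerical sketch. It is not the authors' mechanism: the abstract does not specify how rows are privatized, so the `perturb_rows` helper below is a hypothetical stand-in that resamples each row from a Dirichlet distribution centered on it, and the sketch makes no claim about the privacy level this achieves. The 3-state matrix, the concentration parameter `k`, and all function names are assumptions chosen for illustration. The second-largest eigenvalue modulus is used here as a common proxy for convergence rate, which may differ from the measure the paper analyzes.

```python
import numpy as np

def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Power iteration for pi = pi @ P on a row-stochastic matrix P."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

def kl_divergence(p, q):
    """KL(p || q) for strictly positive probability vectors."""
    return float(np.sum(p * np.log(p / q)))

def slem(P):
    """Second-largest eigenvalue modulus, a proxy for convergence rate."""
    return float(np.sort(np.abs(np.linalg.eigvals(P)))[-2])

def perturb_rows(P, k, rng):
    """Hypothetical stand-in (NOT the paper's mechanism):
    resample each row from Dirichlet(k * row), so each output row
    stays on the unit simplex. Larger k means less noise."""
    return np.vstack([rng.dirichlet(k * row) for row in P])

rng = np.random.default_rng(0)

# Illustrative 3-state transition matrix (assumed, not from the paper).
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

P_priv = perturb_rows(P, k=200.0, rng=rng)

pi = stationary_distribution(P)
pi_priv = stationary_distribution(P_priv)

# Total variation distance between the two stationary distributions.
tv_error = 0.5 * np.abs(pi - pi_priv).sum()

# Row-wise KL divergence between original and perturbed rows.
row_kl = [kl_divergence(p, q) for p, q in zip(P, P_priv)]

# Change in the convergence-rate proxy.
slem_change = abs(slem(P) - slem(P_priv))

print("TV error in stationary distribution:", tv_error)
print("Row-wise KL divergences:", row_kl)
print("Change in SLEM:", slem_change)
```

Re-running with different values of `k` gives a rough feel for the trade-off between noise level and model accuracy, though any mapping from `k` to a formal privacy parameter would have to come from the paper's own analysis.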

Problem

Research questions and friction points this paper is trying to address.

Differential Privacy

Markov Chain

User Data Privacy

Stochastic Matrix

Data-Driven Modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy

Markov Chain

Simplex-Valued Query