AI Summary
This work exposes the cumulative privacy leakage risk in continuous data release settings: even when each individual release satisfies standard privacy guarantees (e.g., differential privacy), adversaries can exploit temporal dependencies across releases to infer sensitive information. To address this, we propose HMM-RL, the first bidirectional inference attack framework integrating Hidden Markov Models (HMMs) with reinforcement learning. HMMs capture latent state transitions in sequential data, while reinforcement learning jointly optimizes forward prediction and backward traceback to refine inference policies. Experiments on Geolife, Porto Taxi, and SynMob trajectory datasets demonstrate that HMM-RL significantly outperforms independent-release baselines, confirming that temporal correlations lead to substantial exhaustion of the privacy budget. The study not only quantifies privacy decay in sequential releases but also establishes a novel paradigm for assessing dynamic privacy risks in evolving data publication scenarios.
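The bidirectional inference idea can be illustrated with a standard HMM forward-backward pass: the forward recursion accumulates evidence from earlier releases, the backward recursion accumulates evidence from later ones, and their product gives a posterior over the hidden state at each step. The toy sketch below is only a minimal illustration of this principle, not the paper's HMM-RL framework; the two-state model, transition/emission matrices, and observation sequence are all hypothetical.

```python
import numpy as np

# Hypothetical toy HMM: two latent location states observed through
# three noisy (sanitized) sequential releases. Numbers are illustrative.
A = np.array([[0.7, 0.3],      # transition probabilities between states
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],      # emission probabilities: how each
              [0.2, 0.8]])     # release looks given the hidden state
pi = np.array([0.5, 0.5])      # uniform prior over the initial state

obs = [0, 1, 0]                # three sequential observed releases

# Forward pass: evidence from releases up to and including step t.
alpha = pi * B[:, obs[0]]
alphas = [alpha]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]
    alphas.append(alpha)

# Backward pass: evidence from releases after step t.
beta = np.ones(2)
betas = [beta]
for o in reversed(obs[1:]):
    beta = A @ (B[:, o] * beta)
    betas.append(beta)
betas.reverse()

# Posterior over the hidden state at each step, combining both directions.
posteriors = [a * b / np.sum(a * b) for a, b in zip(alphas, betas)]
```

An attacker that looks at each release independently would only have the per-step emission likelihoods; the combined forward-backward posterior shows how earlier and later releases jointly sharpen the inference, which is the leakage channel the paper's attack exploits.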
Abstract
Privacy concerns have become increasingly critical in modern AI and data science applications, where sensitive information is collected, analyzed, and shared across diverse domains such as healthcare, finance, and mobility. While prior research has focused on protecting privacy in a single data release, many real-world systems operate under sequential or continuous data publishing, where the same or related data are released over time. Such sequential disclosures introduce new vulnerabilities, as temporal correlations across releases may enable adversaries to infer sensitive information that remains hidden in any individual release. In this paper, we investigate whether an attacker can compromise privacy in sequential data releases by exploiting dependencies between consecutive publications, even when each individual release satisfies standard privacy guarantees. To this end, we propose a novel attack model that captures these sequential dependencies by integrating a Hidden Markov Model with a reinforcement learning-based bidirectional inference mechanism. This enables the attacker to leverage both earlier and later observations in the sequence to infer private information. We instantiate our framework in the context of trajectory data, demonstrating how an adversary can recover sensitive locations from sequential mobility datasets. Extensive experiments on the Geolife, Porto Taxi, and SynMob datasets show that our model consistently outperforms baseline approaches that treat each release independently. The results reveal a fundamental privacy risk inherent to sequential data publishing, where individually protected releases can collectively leak sensitive information when analyzed temporally. These findings underscore the need for new privacy-preserving frameworks that explicitly model temporal dependencies, such as time-aware differential privacy or sequential data obfuscation strategies.