AI Summary
This work exposes the cumulative privacy leakage risk in continuous data release settings: even when each individual release satisfies standard privacy guarantees (e.g., differential privacy), adversaries can exploit temporal dependencies across releases to infer sensitive information. To address this, we propose HMM-RL, the first bidirectional inference attack framework integrating Hidden Markov Models (HMMs) with reinforcement learning. HMMs capture latent state transitions in sequential data, while reinforcement learning jointly optimizes forward prediction and backward traceback to refine inference policies. Experiments on Geolife, Porto Taxi, and SynMob trajectory datasets demonstrate that HMM-RL significantly outperforms independent-release baselines, confirming that temporal correlations lead to substantial exhaustion of the privacy budget. The study not only quantifies privacy decay in sequential releases but also establishes a novel paradigm for assessing dynamic privacy risks in evolving data publication scenarios.
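The bidirectional inference idea can be illustrated with a standard HMM forward-backward pass: the forward recursion accumulates evidence from earlier releases, the backward recursion accumulates evidence from later ones, and their product gives a posterior over the hidden state at each step. The toy sketch below is only a minimal illustration of this principle, not the paper's HMM-RL framework; the two-state model, transition/emission matrices, and observation sequence are all hypothetical.

```python
import numpy as np

# Hypothetical toy HMM: two latent location states observed through
# three noisy (sanitized) sequential releases. Numbers are illustrative.
A = np.array([[0.7, 0.3],      # transition probabilities between states
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],      # emission probabilities: how each
              [0.2, 0.8]])     # release looks given the hidden state
pi = np.array([0.5, 0.5])      # uniform prior over the initial state

obs = [0, 1, 0]                # three sequential observed releases

# Forward pass: evidence from releases up to and including step t.
alpha = pi * B[:, obs[0]]
alphas = [alpha]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]
    alphas.append(alpha)

# Backward pass: evidence from releases after step t.
beta = np.ones(2)
betas = [beta]
for o in reversed(obs[1:]):
    beta = A @ (B[:, o] * beta)
    betas.append(beta)
betas.reverse()

# Posterior over the hidden state at each step, combining both directions.
posteriors = [a * b / np.sum(a * b) for a, b in zip(alphas, betas)]
```

An attacker that looks at each release independently would only have the per-step emission likelihoods; the combined forward-backward posterior shows how earlier and later releases jointly sharpen the inference, which is the leakage channel the paper's attack exploits.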
Abstract
Privacy concerns have become increasingly critical in modern AI and data science applications, where sensitive information is collected, analyzed, and shared across diverse domains such as healthcare, finance, and mobility. While prior research has focused on protecting privacy in a single data release, many real-world systems operate under sequential or continuous data publishing, where the same or related data are released over time. Such sequential disclosures introduce new vulnerabilities, as temporal correlations across releases may enable adversaries to infer sensitive information that remains hidden in any individual release. In this paper, we investigate whether an attacker can compromise privacy in sequential data releases by exploiting dependencies between consecutive publications, even when each individual release satisfies standard privacy guarantees. To this end, we propose a novel attack model that captures these sequential dependencies by integrating a Hidden Markov Model with a reinforcement learning-based bidirectional inference mechanism. This enables the attacker to leverage both earlier and later observations in the sequence to infer private information. We instantiate our framework in the context of trajectory data, demonstrating how an adversary can recover sensitive locations from sequential mobility datasets. Extensive experiments on the Geolife, Porto Taxi, and SynMob datasets show that our model consistently outperforms baseline approaches that treat each release independently. The results reveal a fundamental privacy risk inherent to sequential data publishing, where individually protected releases can collectively leak sensitive information when analyzed temporally. These findings underscore the need for new privacy-preserving frameworks that explicitly model temporal dependencies, such as time-aware differential privacy or sequential data obfuscation strategies.