🤖 AI Summary
Existing differential privacy (DP) auditing methods rely on interventions in the training data, such as deleting samples or injecting adversarial canaries, which makes them computationally expensive and impractical for large-scale systems. This work proposes a **non-intrusive, observability-based auditing framework**, the first to extend DP auditing beyond membership inference to quantitatively assess privacy guarantees for **protected attributes** (e.g., class labels). By leveraging the inherent randomness in data distributions, the approach measures privacy leakage without modifying the original training set. Grounded in DP theory and observability analysis, it sidesteps the engineering bottlenecks of intervention-based methods. Experiments on Criteo and CIFAR-10 show that the framework accurately evaluates label-level privacy protection, scales well, and is readily deployable in real-world ML systems.
📝 Abstract
Differential privacy (DP) auditing is essential for evaluating privacy guarantees in machine learning systems. Existing auditing methods, however, pose a significant challenge for large-scale systems because they require modifying the training dataset -- for instance, by injecting out-of-distribution canaries or removing samples from training. Such interventions on the training data pipeline are resource-intensive and involve considerable engineering overhead. We introduce a novel observational auditing framework that leverages the inherent randomness of data distributions, enabling privacy evaluation without altering the original dataset. Our approach extends privacy auditing beyond traditional membership inference to protected attributes, with labels as a special case, addressing a key gap in existing techniques. We provide theoretical foundations for our method and perform experiments on the Criteo and CIFAR-10 datasets that demonstrate its effectiveness in auditing label privacy guarantees. This work opens new avenues for practical privacy auditing in large-scale production environments.
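The auditing idea above can be illustrated with the standard hypothesis-testing view of DP: any attack distinguishing two neighboring worlds must satisfy TPR ≤ e^ε · FPR + δ, so an attack's observed operating point implies an empirical lower bound ε ≥ ln((TPR − δ)/FPR). The sketch below is a minimal, generic illustration of that bound, not the paper's actual auditing procedure; all function names and rates are hypothetical.

```python
import math

def empirical_epsilon_lower_bound(tpr: float, fpr: float, delta: float = 0.0) -> float:
    """Lower-bound epsilon implied by an attack's true/false positive rates.

    Any (eps, delta)-DP mechanism forces TPR <= e^eps * FPR + delta for
    every distinguishing attack, hence eps >= ln((TPR - delta) / FPR).
    (Hypothetical helper for illustration, not the paper's audit.)
    """
    if fpr <= 0.0 or tpr - delta <= 0.0:
        return 0.0  # attack too weak to certify any leakage
    return max(0.0, math.log((tpr - delta) / fpr))

# Illustrative attack operating points (made-up numbers):
strong = empirical_epsilon_lower_bound(tpr=0.60, fpr=0.10)  # ln(6) ~ 1.79
weak = empirical_epsilon_lower_bound(tpr=0.12, fpr=0.10)    # ln(1.2) ~ 0.18
```

In an observational audit, the "attack" would guess a protected attribute (e.g., the label) rather than membership, using only the randomness already present in the data distribution; the same bound then converts guessing accuracy into an empirical privacy estimate.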