Privacy in Theory, Bugs in Practice: Grey-Box Auditing of Differential Privacy Libraries

📅 2026-02-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the gap between theoretical privacy guarantees and practical implementations of differential privacy libraries, where subtle bugs often undermine formal assurances. Existing verification techniques are either constrained by strong assumptions or ineffective at pinpointing implementation flaws. To bridge this gap, we propose Re:cord-play, a gray-box auditing framework that detects data-dependent control flow and validates sensitivity claims by executing instrumented algorithms on neighboring datasets under identical randomness. We further introduce Re:cord-play-sample, an extension enabling independent statistical audits of individual components. By integrating internal state observation with statistical testing, our approach occupies a middle ground between formal verification and black-box evaluation. Applying our lightweight Python framework to 12 widely used open-source libraries, including Opacus and Diffprivlib, we uncovered 13 previously unknown privacy vulnerabilities, and we publicly release our tool to support community-wide auditing efforts.

📝 Abstract
Differential privacy (DP) implementations are notoriously prone to errors, with subtle bugs frequently invalidating theoretical guarantees. Existing verification methods are often impractical: formal tools are too restrictive, while black-box statistical auditing is intractable for complex pipelines and fails to pinpoint the source of the bug. This paper introduces Re:cord-play, a gray-box auditing paradigm that inspects the internal state of DP algorithms. By running an instrumented algorithm on neighboring datasets with identical randomness, Re:cord-play directly checks for data-dependent control flow and provides concrete falsification of sensitivity violations by comparing declared sensitivity against the empirically measured distance between internal inputs. We generalize this to Re:cord-play-sample, a full statistical audit that isolates and tests each component, including untrusted ones. We show that our novel testing approach is both effective and necessary by auditing 12 open-source libraries, including SmartNoise SDK, Opacus, and Diffprivlib, and uncovering 13 privacy violations that impact their theoretical guarantees. We release our framework as an open-source Python package, thereby making it easy for DP developers to integrate effective, computationally inexpensive, and seamless privacy testing as part of their software development lifecycle.
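The sensitivity check described above can be illustrated with a minimal sketch. This is not the Re:cord-play API; `audited_mean`, `sensitivity_check`, and the trace-list convention are hypothetical names invented here to show the idea: run an instrumented algorithm on two neighboring datasets under identical randomness (same seed), record the value fed into the noise mechanism, and compare the observed internal distance against the declared sensitivity.

```python
import random

def audited_mean(data, declared_sensitivity, epsilon, rng, trace):
    # Hypothetical instrumented DP mean: clip records to [0, 1],
    # then record the internal value passed to the noise mechanism.
    clipped = [min(max(x, 0.0), 1.0) for x in data]
    raw = sum(clipped) / len(clipped)  # intended sensitivity: 1 / len(data)
    trace.append(raw)                  # grey-box observation point
    # Gaussian noise as a stand-in mechanism (scale choice is illustrative).
    return raw + rng.gauss(0.0, declared_sensitivity / epsilon)

def sensitivity_check(algo, d1, d2, declared_sensitivity, epsilon, seed=0):
    """Run algo on neighboring datasets under identical randomness and
    compare the observed internal distance to the declared sensitivity."""
    t1, t2 = [], []
    algo(d1, declared_sensitivity, epsilon, random.Random(seed), t1)
    algo(d2, declared_sensitivity, epsilon, random.Random(seed), t2)
    observed = max(abs(a - b) for a, b in zip(t1, t2))
    return observed <= declared_sensitivity + 1e-12, observed

d1 = [0.2, 0.9, 0.5, 0.7]
d2 = [0.2, 0.9, 0.5, 0.0]  # neighbor: one record changed
ok, observed = sensitivity_check(
    audited_mean, d1, d2, declared_sensitivity=1 / len(d1), epsilon=1.0
)
# Here observed = |0.575 - 0.4| = 0.175 <= 0.25, so the claim holds;
# a buggy implementation (e.g. clipping skipped) would fail this check.
```

Because both runs consume the same random stream, any difference in the traced values is attributable to the data alone, which is what lets a single paired execution falsify a sensitivity claim without statistical sampling.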
Problem

Research questions and friction points this paper is trying to address.

differential privacy
privacy bugs
software verification
auditing
sensitivity violations
Innovation

Methods, ideas, or system contributions that make the work stand out.

gray-box auditing
differential privacy
sensitivity violation
Re:cord-play
privacy testing