SafeTab-P: Disclosure Avoidance for the 2020 Census Detailed Demographic and Housing Characteristics File A (Detailed DHC-A)

📅 2025-05-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Fine-grained race/ethnicity statistics in the 2020 U.S. Census Detailed Demographic and Housing Characteristics File A (DHC-A) pose significant privacy risks due to re-identification vulnerabilities. Method: We propose the first differentially private framework for releasing multi-level geographic and cross-tabulated group statistics, featuring an adaptive granularity control mechanism that dynamically adjusts the number of statistics and the resolution of geographic and categorical dimensions based on group size, coupled with discrete Gaussian noise injection under zero-concentrated differential privacy (zCDP). Contribution/Results: Implemented and deployed via Tumult Analytics within the official census release pipeline, our approach achieves strong formal privacy guarantees (ρ = 0.48 zCDP) while substantially improving statistical utility over prior methods. Empirically tuned and budget-validated, it balances practical deployability with regulatory compliance, establishing a scalable paradigm for privacy-preserving large-scale official statistics.
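The noise mechanism summarized above can be sketched in a few lines. Under zCDP, adding discrete Gaussian noise of scale σ to a count with sensitivity Δ gives a ρ-zCDP guarantee with ρ = Δ²/(2σ²). The sketch below uses a truncated-support sampler for simplicity; the deployed system uses an exact discrete Gaussian sampler, and the function names here are illustrative, not SafeTab-P's.

```python
import math
import random

def discrete_gaussian(sigma: float, rng: random.Random, tail: int = 12) -> int:
    """Sample from a discrete Gaussian on the integers, truncated at
    +/- tail*sigma (truncation error is negligible for tail >= 12).
    P(x) is proportional to exp(-x^2 / (2 sigma^2))."""
    bound = int(math.ceil(tail * sigma))
    support = range(-bound, bound + 1)
    weights = [math.exp(-x * x / (2.0 * sigma * sigma)) for x in support]
    return rng.choices(support, weights=weights, k=1)[0]

def noisy_count(true_count: int, rho: float, sensitivity: float,
                rng: random.Random) -> int:
    """Release a count under rho-zCDP: sigma is chosen so that
    sensitivity^2 / (2 sigma^2) = rho."""
    sigma = sensitivity / math.sqrt(2.0 * rho)
    return true_count + discrete_gaussian(sigma, rng)
```

For example, with ρ = 0.1 and unit sensitivity, σ ≈ 2.24, so released counts typically land within a few units of the truth.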

📝 Abstract
This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the Detailed Demographic and Housing Characteristics File A (Detailed DHC-A) of the 2020 Census. The tabulations contain statistics (counts) of demographic characteristics of the entire population of the United States, crossed with detailed races and ethnicities at varying levels of geography. The article describes the SafeTab-P algorithm, which is based on adding noise drawn from a discrete Gaussian distribution to the statistics of interest. A key innovation in SafeTab-P is the ability to adaptively choose how many statistics to release, and at what granularity, depending on the size of a population group. We prove that the algorithm satisfies a well-studied variant of differential privacy, called zero-concentrated differential privacy (zCDP). We then describe how the algorithm was implemented on Tumult Analytics and briefly outline the parameterization and tuning of the algorithm.
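The adaptive choice described in the abstract can be illustrated with a toy sketch: spend part of the privacy budget on a noisy estimate of the group's size, then pick the tabulation detail from that estimate. The thresholds, level names, and budget split below are invented for illustration (and the size estimate uses a continuous Gaussian as a stand-in for the discrete one); they are not the production values.

```python
import math
import random

def choose_granularity(noisy_size: float) -> str:
    """Map a noisy population-group size to a tabulation level.
    Thresholds are illustrative, not the Census Bureau's."""
    if noisy_size < 500:
        return "total_only"          # release just the group total
    elif noisy_size < 5000:
        return "coarse_age_buckets"  # e.g. a few broad age buckets
    else:
        return "detailed_age_sex"    # full age-by-sex crosstab

def adaptive_release(true_size: int, rho_size: float,
                     rng: random.Random) -> str:
    """Estimate group size with a slice of the zCDP budget, then
    select the level of detail to tabulate for this group."""
    sigma = 1.0 / math.sqrt(2.0 * rho_size)        # unit-sensitivity count
    noisy_size = true_size + rng.gauss(0.0, sigma)  # continuous stand-in
    return choose_granularity(noisy_size)
```

Because the level is chosen from the *noisy* size, the selection step itself consumes privacy budget and composes with the tabulation step.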
Problem

Research questions and friction points this paper is trying to address.

Develops SafeTab-P to protect 2020 Census demographic data
Adaptively releases statistics based on population group size
Ensures zero-concentrated differential privacy (zCDP) compliance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses discrete Gaussian noise for privacy
Adaptively selects statistic granularity by group size
Implements zero-concentrated differential privacy (zCDP)
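zCDP makes the budget accounting behind the bullets above simple: ρ budgets add under composition, and a total ρ converts to a conventional (ε, δ) guarantee via the standard bound ε = ρ + 2·sqrt(ρ·ln(1/δ)). The budget split below is hypothetical (two slices summing to 0.48, matching the total quoted in the summary), not the published SafeTab-P allocation.

```python
import math

def compose_rho(rhos) -> float:
    """zCDP composes additively: mechanisms with budgets
    rho_1, ..., rho_k together satisfy (sum rho_i)-zCDP."""
    return sum(rhos)

def zcdp_to_approx_dp(rho: float, delta: float) -> float:
    """Standard conversion: rho-zCDP implies (eps, delta)-DP with
    eps = rho + 2 * sqrt(rho * ln(1/delta))."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

# Hypothetical split: size-estimation queries + detailed tabulations.
total_rho = compose_rho([0.08, 0.40])          # = 0.48
eps = zcdp_to_approx_dp(total_rho, delta=1e-10)
```

With ρ = 0.48 and δ = 1e-10 this gives ε ≈ 7.13, showing how a single zCDP parameter summarizes many composed noisy releases.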
Sam Haney (Tumult Labs)
Skye Berghel (Tumult Labs)
Bayard Carlson (Tumult Labs)
Ryan Cumings-Menon (U.S. Census Bureau)
Luke Hartman (Tumult Labs)
Michael Hay (Colgate University)
Ashwin Machanavajjhala (Tumult Labs)
G. Miklau (Tumult Labs)
Amritha Pai (Tumult Labs)
Simran Rajpal (Tumult Labs)
David Pujol (Tumult Labs)
William Sexton (Tumult Labs)
Ruchit Shrestha (Tumult Labs)
Daniel Simmons-Marengo (Tumult Labs)