SafeTab-H: Disclosure Avoidance for the 2020 Census Detailed Demographic and Housing Characteristics File B (Detailed DHC-B)

📅 2025-05-02
🤖 AI Summary
This work addresses the privacy-preserving release of the 2020 U.S. Census Detailed Demographic and Housing Characteristics File B (Detailed DHC-B), a set of tabulations of household statistics (household type and tenure) iterated by the householder's detailed race, ethnicity, or American Indian and Alaska Native tribe and village at varying levels of geography. Method: SafeTab-H adds noise drawn from a discrete Gaussian distribution to the tabulated statistics, and the authors show that the algorithm satisfies zero-concentrated differential privacy (zCDP). The implementation is built on the Tumult Analytics privacy library. Contribution/Results: The article characterizes the theoretical expected error of the algorithm, explores its parameter tuning, and reports that SafeTab-H was applied to the release of the Detailed DHC-B as part of the 2020 Census.

📝 Abstract
This article describes SafeTab-H, a disclosure avoidance algorithm applied to the release of the U.S. Census Bureau's Detailed Demographic and Housing Characteristics File B (Detailed DHC-B) as part of the 2020 Census. The tabulations contain household statistics about household type and tenure iterated by the householder's detailed race, ethnicity, or American Indian and Alaska Native tribe and village at varying levels of geography. We describe the algorithmic strategy, which is based on adding noise from a discrete Gaussian distribution, and show that the algorithm satisfies a well-studied variant of differential privacy called zero-concentrated differential privacy. We discuss how the implementation of the SafeTab-H codebase relies on the Tumult Analytics privacy library. We also describe the theoretical expected error properties of the algorithm and explore various aspects of its parameter tuning.
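
For readers unfamiliar with the two ingredients named in the abstract, the block below restates their standard textbook definitions. It is a reference sketch drawn from the general differential privacy literature, not from the SafeTab-H specification: the symbols \sigma, \rho, \Delta, q, and M are generic notation, and nothing here reflects the parameter values actually used for the Detailed DHC-B.

```latex
% Discrete Gaussian with scale parameter \sigma^2, supported on the integers:
\[
  \Pr[X = x] \;=\; \frac{e^{-x^2/(2\sigma^2)}}{\sum_{y \in \mathbb{Z}} e^{-y^2/(2\sigma^2)}},
  \qquad x \in \mathbb{Z}.
\]

% Zero-concentrated differential privacy: a mechanism M is \rho-zCDP if, for all
% neighboring inputs D, D' and every order \alpha > 1, the Renyi divergence satisfies
\[
  D_{\alpha}\bigl(M(D) \,\|\, M(D')\bigr) \;\le\; \rho\,\alpha .
\]

% Discrete Gaussian mechanism: if q is integer-valued with L2 sensitivity \Delta,
% then releasing q(D) + X, with independent discrete Gaussian noise of scale
% \sigma^2 in each coordinate, satisfies
\[
  \rho \;=\; \frac{\Delta^2}{2\sigma^2}\text{-zCDP}.
\]

% Composition: running a \rho_1-zCDP and a \rho_2-zCDP mechanism on the same data
% is (\rho_1 + \rho_2)-zCDP.
```

Because zCDP budgets add under composition, a total budget can be divided across many tabulations, with each share determining the noise scale for that tabulation.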
Problem

Research questions and friction points this paper is trying to address.

Releasing the Detailed DHC-B household tabulations without disclosing respondent data
Guaranteeing differential privacy for statistics on household type and tenure
Tuning algorithm parameters to balance privacy and accuracy
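
The tuning question in the last item can be made concrete with a short calculation. The sketch below is illustrative only: the function names and the budget values are hypothetical, it assumes the standard Gaussian-mechanism relation rho = sensitivity^2 / (2 * sigma^2), and it uses a normal approximation for the margin of error, which is reasonable for a discrete Gaussian only when sigma is not very small. It does not reproduce SafeTab-H's actual budget allocation.

```python
import math


def sigma_squared_for_budget(rho: float, l2_sensitivity: float = 1.0) -> float:
    """Noise scale sigma^2 that spends exactly `rho` of zCDP budget.

    Inverts rho = sensitivity^2 / (2 * sigma^2).
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    return l2_sensitivity ** 2 / (2.0 * rho)


def approx_margin_of_error(sigma_squared: float, z: float = 1.96) -> float:
    """Approximate 95% margin of error via a normal approximation (assumption)."""
    return z * math.sqrt(sigma_squared)


def zcdp_to_approx_dp_epsilon(rho: float, delta: float) -> float:
    """Standard conversion: rho-zCDP implies (epsilon, delta)-DP with this epsilon."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))


# Hypothetical per-statistic budgets, chosen only to show the trade-off.
for rho in (0.001, 0.01, 0.1):
    s2 = sigma_squared_for_budget(rho)
    print(f"rho={rho}: sigma^2={s2:.1f}, approx MOE={approx_margin_of_error(s2):.1f}")
```

Halving the budget doubles sigma^2, so the margin of error grows by a factor of about 1.41; this is the kind of trade-off that parameter tuning has to navigate.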
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses discrete Gaussian noise addition
Implements zero-concentrated differential privacy
Relies on Tumult Analytics privacy library
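
To make the first item concrete, here is a minimal sketch of an exact discrete Gaussian sampler, following the rejection-sampling construction of Canonne, Kamath, and Steinke. It is not the SafeTab-H or Tumult Analytics implementation; all function names, the example count, and the noise scale are hypothetical. It uses only integer and rational arithmetic so that no floating-point rounding enters the sampling step.

```python
import math
import random
from fractions import Fraction


def bernoulli(p: Fraction, rng) -> bool:
    """Return True with probability p (a Fraction in [0, 1]) using integer draws only."""
    return rng.randrange(p.denominator) < p.numerator


def bernoulli_exp(gamma: Fraction, rng) -> bool:
    """Return True with probability exp(-gamma) for a rational gamma >= 0."""
    # Reduce to gamma in [0, 1]: exp(-gamma) = exp(-1)^k * exp(-(gamma - k)).
    while gamma > 1:
        if not bernoulli_exp(Fraction(1), rng):
            return False
        gamma -= 1
    # Draw Bernoulli(gamma / k) for k = 1, 2, ... until the first failure;
    # the failing k is odd with probability exactly exp(-gamma).
    k = 1
    while bernoulli(gamma / k, rng):
        k += 1
    return k % 2 == 1


def discrete_laplace(t: int, rng) -> int:
    """Sample an integer x with probability proportional to exp(-|x| / t)."""
    while True:
        u = rng.randrange(t)
        if not bernoulli_exp(Fraction(u, t), rng):
            continue
        v = 0
        while bernoulli_exp(Fraction(1), rng):
            v += 1
        magnitude = u + t * v
        negative = rng.randrange(2) == 1
        if negative and magnitude == 0:
            continue  # reject "-0" so that zero is not counted twice
        return -magnitude if negative else magnitude


def discrete_gaussian(sigma_squared, rng) -> int:
    """Sample an integer x with probability proportional to exp(-x^2 / (2 sigma^2))."""
    sigma_squared = Fraction(sigma_squared)
    if sigma_squared <= 0:
        raise ValueError("sigma_squared must be positive")
    # t = floor(sigma) + 1, computed without floating point.
    t = math.isqrt(sigma_squared.numerator // sigma_squared.denominator) + 1
    while True:
        y = discrete_laplace(t, rng)
        gamma = (abs(y) - sigma_squared / t) ** 2 / (2 * sigma_squared)
        if bernoulli_exp(gamma, rng):
            return y


def noisy_count(true_count: int, sigma_squared, rng) -> int:
    """Add discrete Gaussian noise to an integer count (illustrative helper)."""
    return true_count + discrete_gaussian(sigma_squared, rng)


# Hypothetical usage: SystemRandom draws from the OS entropy source.
example = noisy_count(120, Fraction(25, 2), random.SystemRandom())
```

The output is always an integer and may be negative; any post-processing of noisy counts is a separate design decision that this sketch does not address.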
Authors

William Sexton, Tumult Labs
Skye Berghel, Tumult Labs
Bayard Carlson, Tumult Labs
Sam Haney, Tumult Labs
Luke Hartman, Tumult Labs
Michael Hay, Colgate University
Ashwin Machanavajjhala, Tumult Labs
G. Miklau, Tumult Labs
Amritha Pai, Tumult Labs
Simran Rajpal, Tumult Labs
David Pujol, Tumult Labs (Privacy, Algorithmic fairness)
Ruchit Shrestha, Tumult Labs
Daniel Simmons-Marengo, Tumult Labs