🤖 AI Summary
This paper addresses privacy-preserving release of the 2020 U.S. Census Supplemental Demographic and Housing Characteristics (S-DHC) files.
Method: It describes PHSafe, the disclosure avoidance algorithm used for the S-DHC, which adds noise drawn from a discrete Gaussian distribution to the statistics of interest. The paper proves that PHSafe satisfies zero-concentrated differential privacy (zCDP), a well-studied variant of differential privacy, and describes how the algorithm was implemented on the Tumult Analytics platform.
Contribution/Results: (1) A description of the PHSafe algorithm as used in production for the S-DHC; (2) a proof that the algorithm satisfies zCDP; (3) an account of its implementation on Tumult Analytics, with a brief outline of how the algorithm was parameterized and tuned. The U.S. Census Bureau used this algorithm to protect the 2020 Census S-DHC release.
📝 Abstract
This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the 2020 Census Supplemental Demographic and Housing Characteristics File (S-DHC). The tabulations contain statistics of counts of U.S. persons living in certain types of households, including averages. The article describes the PHSafe algorithm, which is based on adding noise drawn from a discrete Gaussian distribution to the statistics of interest. We prove that the algorithm satisfies a well-studied variant of differential privacy, called zero-concentrated differential privacy. We then describe how the algorithm was implemented on Tumult Analytics and briefly outline the parameterization and tuning of the algorithm.
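The noise-addition step mentioned in the abstract can be illustrated with a minimal sketch. It is not the PHSafe implementation and does not use the Tumult Analytics API; all function names here are hypothetical. It assumes an integer-valued count with L2 sensitivity `l2_sensitivity`, and relies on the standard result that adding discrete Gaussian noise with variance sigma^2 to such a query satisfies rho-zCDP with rho = sensitivity^2 / (2 * sigma^2). The sampler uses rejection sampling from a discrete Laplace proposal and floating-point arithmetic, which is acceptable for illustration but not for a production release, where exact arithmetic is normally required.

```python
import math
import random


def sigma_for_rho(rho: float, l2_sensitivity: float = 1.0) -> float:
    """Return sigma such that l2_sensitivity**2 / (2 * sigma**2) == rho."""
    if rho <= 0 or l2_sensitivity <= 0:
        raise ValueError("rho and l2_sensitivity must be positive")
    return l2_sensitivity / math.sqrt(2.0 * rho)


def sample_discrete_laplace(t: int, rng: random.Random) -> int:
    """Sample Y on the integers with P[Y = y] proportional to exp(-|y| / t).

    The difference of two i.i.d. geometric variables with success
    parameter 1 - exp(-1/t) has exactly this two-sided geometric law.
    """
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    g1 = math.floor(-t * math.log(1.0 - rng.random()))
    g2 = math.floor(-t * math.log(1.0 - rng.random()))
    return g1 - g2


def sample_discrete_gaussian(sigma: float, rng: random.Random) -> int:
    """Sample an integer with P[Y = y] proportional to exp(-y**2 / (2 sigma**2)).

    Floating-point illustration only (assumption: precision issues are
    ignored here; a production sampler would use exact arithmetic).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    t = math.floor(sigma) + 1
    while True:
        y = sample_discrete_laplace(t, rng)
        # Acceptance ratio of target to proposal, scaled to be at most 1.
        accept_prob = math.exp(-((abs(y) - sigma**2 / t) ** 2) / (2.0 * sigma**2))
        if rng.random() < accept_prob:
            return y


def noisy_count(true_count: int, rho: float, rng: random.Random,
                l2_sensitivity: float = 1.0) -> int:
    """Add discrete Gaussian noise calibrated to a zCDP budget rho."""
    sigma = sigma_for_rho(rho, l2_sensitivity)
    return true_count + sample_discrete_gaussian(sigma, rng)


if __name__ == "__main__":
    rng = random.Random(2020)
    # Hypothetical count and budget, chosen only for demonstration.
    print(noisy_count(1250, rho=0.02, rng=rng))
```

Because the noise is integer-valued, a noisy count stays an integer. A smaller `rho` gives a larger `sigma`, so stronger privacy is paid for with noisier statistics.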