🤖 AI Summary
This study examines the privacy guarantees of the Permutation Swapping Algorithm (PSA) and the TopDown Algorithm (TDA) through the lens of differential privacy (DP) in the context of the U.S. Decennial Census, where both algorithms leave some statistics of the confidential data unaltered (the methods' invariants).
Method: We prove that the PSA satisfies ε-DP subject to the invariants it necessarily induces; by a similar modification of ρ-zero concentrated DP, we also provide a DP specification for the TDA that accounts for its invariants.
Contribution/Results: Building on the system of DP specifications developed in Part I of this series, we show that both a traditional swapping method and the 2020 disclosure avoidance system can be described by formal DP guarantees that hold subject to their invariants. In the counterfactual scenario where the PSA was adopted for the 2020 Census, the nominal privacy loss (ε) would have been lower, but many more invariants would have been released. The mathematical guarantees therefore need careful translation into actual privacy protection, as is the case for any DP deployment.
📝 Abstract
Through the lens of the system of differential privacy specifications developed in Part I of a trio of articles, this second paper examines two statistical disclosure control (SDC) methods for the United States Decennial Census: the Permutation Swapping Algorithm (PSA), which is similar to the 2010 Census's disclosure avoidance system (DAS), and the TopDown Algorithm (TDA), which was used in the 2020 DAS. To varying degrees, both methods leave unaltered some statistics of the confidential data – which are called the method's invariants – and hence neither can be readily reconciled with differential privacy (DP), at least as it was originally conceived. Nevertheless, we establish that the PSA satisfies $\varepsilon$-DP subject to the invariants it necessarily induces, thereby showing that this traditional SDC method can in fact still be understood within our more-general system of DP specifications. By a similar modification to $\rho$-zero concentrated DP, we also provide a DP specification for the TDA. Finally, as a point of comparison, we consider the counterfactual scenario in which the PSA was adopted for the 2020 Census, resulting in a reduction in the nominal privacy loss, but at the cost of releasing many more invariants. Therefore, while our results explicate the mathematical guarantees of SDC provided by the PSA, the TDA and the 2020 DAS in general, care must be taken in their translation to actual privacy protection – just as is the case for any DP deployment.
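To make the notion of an invariant concrete, the following toy sketch shows how a swapping-style method can leave certain tabulations of the data exactly unchanged. It is an illustration only and not the PSA as specified in the paper: the record fields (`block`, `household_size`, `race`), the swap rate, and the function `toy_swap` are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def toy_swap(records, swap_rate, rng):
    """Toy swap (hypothetical, not the paper's PSA).

    Within each stratum (records sharing the same 'household_size'),
    select each record independently with probability `swap_rate`,
    then randomly permute the 'block' values among the selected records.
    """
    out = [dict(r) for r in records]  # work on a copy
    strata = {}
    for i, r in enumerate(out):
        strata.setdefault(r["household_size"], []).append(i)
    for idx in strata.values():
        chosen = [i for i in idx if rng.random() < swap_rate]
        blocks = [out[i]["block"] for i in chosen]
        rng.shuffle(blocks)  # permute geography among selected records
        for i, b in zip(chosen, blocks):
            out[i]["block"] = b
    return out


def count(data, *keys):
    """Contingency table of `data` over the given fields."""
    return Counter(tuple(r[k] for k in keys) for r in data)


rng = random.Random(0)
# Synthetic records; values are made up for illustration.
records = [
    {
        "block": rng.choice("ABC"),
        "household_size": rng.choice([1, 2, 3]),
        "race": rng.choice("xyz"),
    }
    for _ in range(200)
]
swapped = toy_swap(records, swap_rate=0.3, rng=rng)

# These tables are unchanged by construction, whatever the random draws:
same_geo = count(swapped, "block", "household_size") == count(records, "block", "household_size")
same_demo = count(swapped, "household_size", "race") == count(records, "household_size", "race")
print(same_geo, same_demo)
```

In this sketch the block-by-household-size table and the household-size-by-race table are invariants, because block labels only move between records in the same stratum and all other fields stay with their record. The block-by-race table is not protected in this way and may change from run to run.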