🤖 AI Summary
This study identifies a concrete privacy threat: publicly available statistics, such as U.S. Census and HUD datasets, can be combined in a reconstruction attack to identify subsidized households whose true composition violates occupancy guidelines, exposing low-income tenants to eviction risk. Using published 2010 statistics, the authors demonstrate that a simple, inexpensive data-fusion and reconstruction attack could have flagged such noncompliant households. Experiments on synthetic data show that a random swapping mechanism resembling the Census Bureau's 2010 disclosure avoidance measures offers little protection against this attack, whereas a differentially private mechanism resembling the 2020 disclosure avoidance system substantially reduces its precision. The work establishes an empirically grounded example of the privacy risk that statistical disclosure poses to marginalized populations, and offers evidence and methodological guidance for policymakers weighing disclosure avoidance protections for public housing statistics.
📝 Abstract
As the U.S. Census Bureau implements its controversial new disclosure avoidance system, researchers and policymakers debate the necessity of new privacy protections for public statistics. With experiments on both published statistics and synthetic data, we explore a particular privacy concern: respondents in subsidized housing may deliberately not mention unauthorized children and other household members for fear of being evicted. By combining public statistics from the Decennial Census and the Department of Housing and Urban Development, we demonstrate a simple, inexpensive reconstruction attack that could identify subsidized households living in violation of occupancy guidelines in 2010. Experiments on synthetic data suggest that a random swapping mechanism similar to the Census Bureau's 2010 disclosure avoidance measures does not significantly reduce the precision of this attack, while a differentially private mechanism similar to the 2020 disclosure avoidance system does. Our results provide a valuable example for policymakers seeking a trustworthy, accurate census.
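The contrast the abstract draws is between perturbing published counts with calibrated noise (differential privacy) and merely swapping records, which leaves aggregate counts exact and thus still linkable. The sketch below is not the Census Bureau's actual 2020 mechanism (the TopDown algorithm uses discrete noise and post-processing); it is a minimal Laplace-mechanism illustration of how a small, hypothetical tract-level household count would be released with noise so that exact cross-tabulations can no longer be matched deterministically against HUD records:

```python
import math
import random

def laplace_mechanism(true_count, epsilon, sensitivity=1):
    """Release a count with Laplace noise scaled for epsilon-differential privacy.

    A counting query has sensitivity 1: adding or removing one household
    changes the count by at most 1, so the noise scale is sensitivity/epsilon.
    """
    scale = sensitivity / epsilon
    # Inverse-CDF sampling of a Laplace(0, scale) variate.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Hypothetical tract-level count of large households in subsidized units.
true_count = 7
noisy_count = laplace_mechanism(true_count, epsilon=1.0)
```

Under swapping, an attacker matching exact published counts across tables recovers them unchanged; under the mechanism above, each released count carries independent noise with standard deviation sqrt(2)/epsilon, which is what degrades the precision of a reconstruction-and-linkage attack.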