🤖 AI Summary
This study addresses the lack of high-fidelity datasets that jointly represent building structures and household-level demographic characteristics, a gap that hinders accurate modeling of human–environment interactions. The authors propose a three-stage framework: (i) generating a synthetic population from ACS PUMS records, (ii) modeling housing–household compatibility with deep contrastive learning, and (iii) allocating households to housing units from the National Structure Inventory (NSI) through a hierarchical optimization procedure that enforces building-level capacity and block-group-level demographic constraints. The resulting joint inventory links building-level housing data to household-level population data and supports analysis across building, neighborhood, and regional scales. An evaluation in coastal North Carolina shows that the method matches block-group-level demographic distributions, reproduces observed spatial population patterns without systematic bias, identifies compatible housing–household pairs with high predictive accuracy, and maintains consistent allocation quality across urban, suburban, and rural areas.
📝 Abstract
Accurately understanding the interactions between humans and the built environment requires integrated representations of both the buildings and the populations that occupy them. However, high-fidelity datasets that jointly capture detailed housing structures and demographic characteristics at the household level do not currently exist. This paper presents a framework for constructing a joint housing-household inventory that explicitly links individuals and households to compatible housing units from the National Structure Inventory (NSI), while preserving realistic population densities and demographic distributions. The framework integrates three components: (i) synthetic population generation from American Community Survey (ACS) Public Use Microdata Sample (PUMS) records that preserve complex intra-household relationships; (ii) a deep contrastive learning model that quantifies housing-household compatibility; and (iii) a hierarchical optimization-based allocation procedure that enforces building-level capacity and block-group-level demographic constraints. The generated synthetic population attains high statistical realism relative to the census microdata, and the contrastive learning model identifies compatible housing-household pairs with high predictive accuracy. Applied to coastal North Carolina, evaluations at building, neighborhood, and regional scales show that the joint inventory matches block-group-level demographic distributions, reproduces observed spatial population patterns without systematic bias, and maintains consistent allocation quality across urban, suburban, and rural contexts. By enabling coupled household- and building-level analyses, the resulting inventory supports a broad range of applications, including disaster resilience planning, housing and affordability analysis, energy-use assessment, and public health research.
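To make the allocation step (iii) more concrete, the following is a minimal sketch of capacity-constrained assignment driven by compatibility scores. It is an illustrative stand-in, not the authors' procedure: it uses a simple greedy rule instead of hierarchical optimization, it ignores block-group-level demographic constraints, and the function name `allocate_households`, the identifiers `h1`/`b1`, and the score values are all hypothetical. In the framework described above, the scores would come from the contrastive compatibility model.

```python
from typing import Dict, Tuple


def allocate_households(
    scores: Dict[Tuple[str, str], float],
    capacity: Dict[str, int],
) -> Dict[str, str]:
    """Greedily assign each household to one building without exceeding capacity.

    scores maps (household_id, building_id) to a compatibility score
    (hypothetical values; higher means more compatible).
    capacity maps building_id to the number of housing units available.
    """
    remaining = dict(capacity)  # copy so the caller's dict is not mutated
    assignment: Dict[str, str] = {}

    # Visit candidate pairs from most to least compatible.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for (household, building), _score in ranked:
        if household in assignment:
            continue  # household already placed
        if remaining.get(building, 0) <= 0:
            continue  # building has no free units
        assignment[household] = building
        remaining[building] -= 1

    return assignment


# Toy example: three households, two buildings (all values are made up).
scores = {
    ("h1", "b1"): 0.9, ("h1", "b2"): 0.2,
    ("h2", "b1"): 0.8, ("h2", "b2"): 0.6,
    ("h3", "b1"): 0.4, ("h3", "b2"): 0.5,
}
capacity = {"b1": 1, "b2": 2}
result = allocate_households(scores, capacity)
print(result)
```

In this toy case, `h2` prefers `b1` but is placed in `b2` because `b1` has a single unit that goes to the higher-scoring `h1`. A greedy rule like this can be far from optimal on real data, which is one reason an optimization-based formulation with explicit demographic constraints, as the abstract describes, is preferable in practice.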