🤖 AI Summary
This paper addresses the challenge of modeling implicit matching relationships between housing units and household characteristics when no ground-truth annotations are available. We propose an unsupervised dual-encoder contrastive learning framework. Methodologically, the framework combines bisecting K-means clustering, which compensates for the missing labels, with a dual-encoder design that handles the semantic differences between housing (building) and household (people) features, so that the two heterogeneous entities can be aligned without labeled data. Using synthetic ground-truth validation and SHAP-based interpretability analysis, we find that tenure and mortgage attributes, rather than conventional size-based features such as the number of persons and rooms, contribute most to housing-household matching. Empirically, the method outperforms state-of-the-art baselines on Delaware data and transfers to North Carolina. The work enables joint housing-household modeling without supervision and applies more broadly to problems that require learning relationships between two distinct entities without labeled ground truth.
📝 Abstract
Housing and household characteristics are key determinants of social and economic well-being, yet our understanding of their interrelationships remains limited. This study addresses this knowledge gap by developing a deep contrastive learning (DCL) model to infer housing-household relationships using the American Community Survey (ACS) Public Use Microdata Sample (PUMS). More broadly, the proposed model is suitable for a class of problems where the goal is to learn joint relationships between two distinct entities without explicitly labeled ground-truth data. Our dual-encoder DCL approach leverages co-occurrence patterns in PUMS and introduces a bisecting K-means clustering method to overcome the absence of ground-truth labels. The dual-encoder architecture is designed to handle the semantic differences between housing (building) and household (people) features while mitigating the noise introduced by clustering. To validate the model, we generate a synthetic ground-truth dataset and conduct comprehensive evaluations. The model also outperforms state-of-the-art methods in capturing housing-household relationships in Delaware. A transferability test in North Carolina confirms its generalizability across diverse sociodemographic and geographic contexts. Finally, post-hoc explainable AI analysis using SHAP values reveals that tenure status and mortgage information play a more significant role in housing-household matching than traditionally emphasized factors such as the number of persons and rooms.
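To make the dual-encoder contrastive idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes two separate linear encoders (one per entity type) and a symmetric InfoNCE-style loss in which the housing record and household record from the same PUMS row form the positive pair. The loss form, the temperature value, the feature dimensions, and all names (`symmetric_info_nce`, `W_housing`, `W_household`) are illustrative assumptions; the bisecting K-means step described in the abstract is omitted.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(z, axis):
    """Numerically stable log-softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def symmetric_info_nce(housing_emb, household_emb, temperature=0.1):
    """Symmetric contrastive loss; row i of each input is a co-occurring pair.

    Assumption: an InfoNCE-style objective is used here for illustration only.
    """
    h = l2_normalize(housing_emb)
    p = l2_normalize(household_emb)
    logits = h @ p.T / temperature          # (n, n) similarity matrix
    idx = np.arange(len(h))                 # positives sit on the diagonal
    loss_h2p = -log_softmax(logits, axis=1)[idx, idx].mean()  # housing -> household
    loss_p2h = -log_softmax(logits, axis=0)[idx, idx].mean()  # household -> housing
    return 0.5 * (loss_h2p + loss_p2h)

# Hypothetical toy data: 8 paired records with different feature widths.
rng = np.random.default_rng(0)
housing_x = rng.normal(size=(8, 5))      # e.g. building-side features
household_x = rng.normal(size=(8, 7))    # e.g. people-side features

# Two separate (untrained) linear encoders mapping into a shared 4-d space.
W_housing = rng.normal(size=(5, 4))
W_household = rng.normal(size=(7, 4))
loss_random = symmetric_info_nce(housing_x @ W_housing, household_x @ W_household)

# Reference point: identical embeddings on both sides (perfect alignment).
aligned = rng.normal(size=(8, 4))
loss_aligned = symmetric_info_nce(aligned, aligned)

print(f"untrained encoders: {loss_random:.3f}, perfectly aligned: {loss_aligned:.3f}")
```

Keeping a separate encoder per entity lets each side have its own input width and feature semantics while still being compared in one embedding space, which is the property the abstract attributes to the dual-encoder design. In a real training loop the encoder weights would be updated to drive this loss down.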