Linking COPD Prevalence with Income Distribution: A Spatial Heterogeneous Compositional Regression via Geographically Weighted Penalized Approach

📅 2026-05-12

🤖 AI Summary
Existing spatial regression methods struggle to model health disparities that combine strong geographic heterogeneity, compositional income data, and discontinuous regional effects. This study proposes a geographically weighted penalized compositional regression model that combines a pairwise fusion penalty with nonconvex penalties such as the minimax concave penalty (MCP), relaxing the usual assumptions of spatial smoothness and geographic contiguity. The approach identifies clusters of regions with similar socioeconomic structures, even when those regions are not geographically contiguous. The work is presented as the first to introduce nonconvex regularization into compositional data regression, with the aim of capturing discontinuous spatial heterogeneity. Applied to COPD prevalence and income composition in the United States, the method reveals heterogeneous associations that conventional models obscure, and is reported to improve estimation accuracy, interpretability, and scalability.
📝 Abstract
Income inequality is a major contributor to health disparities, yet its effects often vary by geography and are commonly represented as compositional distributions (e.g., proportions of households across income brackets). Existing spatial regression methods struggle in this setting: they typically assume smooth spatial variation, cannot accommodate abrupt spatial heterogeneity, and lack principled treatment of compositional covariates. We propose a geographically weighted penalized compositional regression model that addresses these challenges simultaneously. Our method adopts a pairwise fusion penalty that enables detection of both contiguous and noncontiguous regional clusters with shared regression effects, thereby relaxing strong assumptions of spatial smoothness and geographic contiguity. This allows regions with similar underlying socioeconomic structures to be identified even when they are not geographically adjacent. By incorporating nonconvex penalties, such as the minimax concave penalty (MCP), the approach achieves improved estimation accuracy, interpretability, and scalability in high-dimensional spatial settings. We illustrate the method through an analysis linking U.S. income composition to chronic obstructive pulmonary disease (COPD) prevalence, revealing spatially heterogeneous associations that are obscured by conventional models. The proposed framework provides a flexible and robust tool for spatial data analysis involving compositional predictors and region-specific heterogeneity.
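To make "compositional covariates" concrete, the following is a minimal sketch of one common way to regress on shares that sum to one: a log-contrast model, y = b0 + sum_j beta_j log(x_j) with the constraint sum_j beta_j = 0. The abstract does not state which compositional formulation the paper uses, so the model form, the function names `log_contrast_design` and `fit_log_contrast`, and the `eps` clipping are all assumptions made for illustration. The sketch is global and unpenalized; it omits the geographic weighting and fusion penalty.

```python
import numpy as np

def log_contrast_design(shares, eps=1e-12):
    """Map rows of bracket shares to log shares (hypothetical helper).

    Rows are re-closed to sum to 1; eps guards against log(0).
    """
    shares = np.asarray(shares, dtype=float)
    closed = shares / shares.sum(axis=1, keepdims=True)
    return np.log(np.clip(closed, eps, None))

def fit_log_contrast(shares, y):
    """Least squares for y = b0 + sum_j beta_j * log(x_j), with sum_j beta_j = 0.

    Assumed model form; not taken from the paper.
    """
    Z = log_contrast_design(shares)
    p = Z.shape[1]
    # Columns of C span {beta : sum(beta) = 0}, so beta = C @ g
    # satisfies the zero-sum constraint for any g.
    C = np.vstack([np.eye(p - 1), -np.ones((1, p - 1))])
    A = np.column_stack([np.ones(len(y)), Z @ C])
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef[0], C @ coef[1:]
```

The zero-sum constraint is what makes the fit invariant to rescaling the shares (for example, proportions versus percentages), which is the usual motivation for treating compositional predictors separately from ordinary covariates.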
Problem

Research questions and friction points this paper is trying to address.

spatial heterogeneity
compositional data
income inequality
COPD prevalence
geographic variation
Innovation

Methods, ideas, or system contributions that make the work stand out.

geographically weighted regression
compositional data
spatial heterogeneity
pairwise fusion penalty
nonconvex penalty
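
As an illustration of how the pairwise fusion and nonconvex penalty ideas listed above fit together, here is a minimal sketch that evaluates an MCP-based fusion penalty over region-specific coefficient vectors. The MCP formula is the standard one; applying it to the Euclidean distance between coefficient vectors of every region pair, and the names `mcp`, `pairwise_fusion_penalty`, `lam`, and `gamma`, are assumptions for illustration rather than the paper's exact specification.

```python
import numpy as np

def mcp(t, lam, gamma=3.0):
    """Minimax concave penalty evaluated elementwise on |t|.

    Equals lam*|t| - t^2/(2*gamma) for |t| <= gamma*lam and the
    constant gamma*lam^2/2 beyond that point.
    """
    a = np.abs(np.asarray(t, dtype=float))
    return np.where(a <= gamma * lam,
                    lam * a - a**2 / (2.0 * gamma),
                    0.5 * gamma * lam**2)

def pairwise_fusion_penalty(B, lam, gamma=3.0):
    """Sum of MCP(||B[i] - B[j]||_2) over all region pairs i < j.

    B has one row of regression coefficients per region. All pairs
    are penalized (assumed here), not only geographic neighbours, so
    noncontiguous regions can be fused into one cluster.
    """
    i, j = np.triu_indices(B.shape[0], k=1)
    d = np.linalg.norm(B[i] - B[j], axis=1)
    return float(mcp(d, lam, gamma).sum())
```

Because the MCP flattens to a constant for large differences, pairs of regions with clearly distinct coefficients are not shrunk toward each other, while near-identical pairs are pulled together; this is the property that lets a fusion penalty recover abrupt rather than smooth spatial changes.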