Topic-informed dynamic mixture model for occupational heterogeneity in health risk behaviors

📅 2025-12-23
🤖 AI Summary
This study investigates how occupational environments heterogeneously moderate the sociodemographic determinants of four key health risk behaviors: smoking, unhealthy diet, hazardous alcohol use, and physical inactivity (collectively termed SNAP). To address this, we propose a dynamic mixture of multivariate ordered probit models that integrates structural topic modeling (STM) with non-local spike-and-slab priors to support variable selection and parameter interpretability. A sequential Monte Carlo online learning scheme updates the model as new data become available. Applied to Italy's PASSI surveillance survey data, the model estimates covariate effects separately for each latent occupational cluster, showing how occupational context modulates the effect of sociodemographic factors on SNAP behaviors. The framework supports scalable identification of high-risk occupational subpopulations and the design of targeted, evidence-based interventions.

📝 Abstract
Behavioral risk factors, i.e., smoking, poor nutrition, alcohol misuse, and physical inactivity (SNAP), are leading contributors to chronic diseases and healthcare costs worldwide. Their prevalence is shaped by demographic characteristics and also by contextual ones such as socioeconomic and occupational environments. In this study, we leverage data from the Italian health and behavioral surveillance system PASSI to model SNAP behaviors through a Bayesian framework that integrates textual information on occupations. We use Structural Topic Modeling (STM) to cluster free-text job descriptions into latent occupational groups, which inform mixture weights in a multivariate ordered probit model. Covariate effects are allowed to vary across occupational clusters and evolve over time. To enhance interpretability and variable selection, we impose non-local spike-and-slab priors on regression coefficients. Finally, an online learning algorithm based on sequential Monte Carlo enables efficient updating as new data become available. This dynamic, scalable, and interpretable approach shows how occupational contexts modulate the impact of socio-demographic factors on health behaviors, providing valuable insights for targeted public health interventions.
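To make the role of the topic-based mixture weights concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single ordinal outcome (the paper's model is multivariate), fixed cutpoints, and cluster-specific coefficient vectors at one time point. All names (`mixture_probs`, `betas`, `weights`, `cutpoints`) and all numbers are hypothetical. Each occupational cluster contributes its own ordered probit probabilities, and the STM topic proportions of a respondent's job description act as the mixing weights.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(eta, cutpoints):
    """Category probabilities of one ordered probit component.

    eta is the linear predictor x'beta; cutpoints are increasing thresholds.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [normal_cdf(b - eta) for b in bounds]
    return [cdf[c + 1] - cdf[c] for c in range(len(bounds) - 1)]


def mixture_probs(x, betas, weights, cutpoints):
    """Mix cluster-specific ordered probit probabilities.

    weights: topic proportions for this respondent (assumed to sum to 1).
    betas:   one coefficient vector per occupational cluster.
    """
    n_cat = len(cutpoints) + 1
    mixed = [0.0] * n_cat
    for w, beta in zip(weights, betas):
        eta = sum(xj * bj for xj, bj in zip(x, beta))
        comp = ordered_probit_probs(eta, cutpoints)
        for c in range(n_cat):
            mixed[c] += w * comp[c]
    return mixed


# Hypothetical respondent: two covariates, two clusters, three ordinal levels.
x = [1.0, 0.5]
betas = [[0.2, -0.4], [1.0, 0.3]]
weights = [0.7, 0.3]  # illustrative STM topic proportions
cutpoints = [0.0, 1.0]
probs = mixture_probs(x, betas, weights, cutpoints)
```

Because the same covariate vector is paired with a different coefficient vector in each cluster, the effect of a covariate on the outcome depends on how strongly a respondent's occupation loads on each cluster.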
Problem

Research questions and friction points this paper is trying to address.

Modeling SNAP health behaviors using occupational text data
Allowing covariate effects to vary across occupational clusters over time
Providing interpretable insights for targeted public health interventions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian framework with Structural Topic Modeling for occupations
Multivariate ordered probit model with time-varying covariate effects
Online learning via sequential Monte Carlo for dynamic updates
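As a rough illustration of the sequential Monte Carlo idea named above, the sketch below shows one generic reweight-and-resample update when a new batch of data arrives. It is not the paper's algorithm: it assumes a scalar parameter, a Gaussian likelihood with unit variance, and multinomial resampling, and it omits the rejuvenation (move) step that practical samplers usually add. The function `smc_update`, the threshold `ess_frac`, and the data are all hypothetical.

```python
import math
import random


def smc_update(particles, log_weights, batch, log_lik, ess_frac=0.5, rng=None):
    """One SMC step: reweight particles by the new batch, resample if needed."""
    rng = rng or random.Random(0)
    n = len(particles)
    # Reweight: add the batch log-likelihood to each particle's log-weight.
    new_lw = [lw + sum(log_lik(p, obs) for obs in batch)
              for p, lw in zip(particles, log_weights)]
    # Normalize in log space to avoid underflow.
    m = max(new_lw)
    log_total = m + math.log(sum(math.exp(lw - m) for lw in new_lw))
    new_lw = [lw - log_total for lw in new_lw]
    w = [math.exp(lw) for lw in new_lw]
    # Effective sample size measures weight degeneracy.
    ess = 1.0 / sum(wi * wi for wi in w)
    if ess < ess_frac * n:
        # Multinomial resampling, then reset to uniform weights.
        particles = rng.choices(particles, weights=w, k=n)
        new_lw = [-math.log(n)] * n
    return particles, new_lw


def gaussian_log_lik(theta, obs):
    """Log-likelihood of obs under N(theta, 1), up to an additive constant."""
    return -0.5 * (obs - theta) ** 2


rng = random.Random(42)
n_particles = 2000
particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]  # prior draws
log_weights = [-math.log(n_particles)] * n_particles
batch = [1.4, 1.6, 1.5, 1.7, 1.3]  # illustrative new observations

particles, log_weights = smc_update(particles, log_weights, batch,
                                    gaussian_log_lik, rng=rng)
estimate = sum(math.exp(lw) * p for p, lw in zip(particles, log_weights))
```

Calling `smc_update` again with the next batch reuses the current particles and weights, which is what makes the scheme an online update rather than a full refit.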
Lorenzo Schiavon
Department of Economics, Ca’ Foscari University of Venice, Venice, Italy
Mattia Stival
Department of Economics, Ca’ Foscari University of Venice, Venice, Italy
Angela Andreella
Ca' Foscari University of Venice
Multivariate analysis, Social Statistics, Psychometrics, High-dimensional data, Permutation tests
Stefano Campostrini
Department of Economics, Ca’ Foscari University of Venice, Venice, Italy