Joint Modeling of Longitudinal EHR Data with Shared Random Effects for Informative Visiting and Observation Processes

📅 2026-02-17
🤖 AI Summary
This study addresses bias in association estimation arising from non-random clinic visits (informative visit processes) and selective biomarker measurements (informative observation mechanisms) in longitudinal electronic health records. We propose a unified semiparametric joint modeling framework that captures the dependence among the visit process, biomarker observation process, and longitudinal outcome process through shared subject-specific Gaussian latent variables, thereby integrating both informative visit and informative observation mechanisms within a single model for the first time. A three-stage estimation procedure based on shared random effects is developed, incorporating sequential imputation to handle complex missingness patterns, and the resulting estimator is shown to be consistent and asymptotically normal. Simulations demonstrate unbiased estimation with substantial improvements over existing methods, and application to the All of Us dataset robustly reveals associations between neighborhood socioeconomic status and trajectories of six blood-based biomarkers.

📝 Abstract
Longitudinal electronic health record (EHR) data offer opportunities to study biomarker trajectories; however, association estimates (the primary inferential target) from standard models designed for regular observation times may be biased by a two-stage hierarchical missingness mechanism. The first stage is the visiting process (informative presence), where encounters occur at irregular times driven by patient health status; the second is the observation process (informative observation), where biomarkers are selectively measured during visits. To address these mechanisms, we propose a unified semiparametric joint modeling framework that simultaneously characterizes the visiting, biomarker observation, and longitudinal outcome processes. Central to this framework is a shared subject-specific Gaussian latent variable that captures unmeasured frailty and induces dependence across all components. We develop a three-stage estimation procedure and establish the consistency and asymptotic normality of our estimators. We also introduce a sequential procedure that imputes missing biomarkers prior to adjusting for irregular visiting and examine its performance. Simulation results demonstrate that our method yields unbiased estimates under this mechanism, whereas existing approaches can be substantially biased; notably, methods adjusting only for irregular visiting may exhibit even greater bias than those ignoring both mechanisms. We apply our framework to data from the All of Us Research Program to investigate associations between neighborhood-level socioeconomic status indicators and six blood-based biomarker trajectories, providing a robust tool for outpatient settings where irregular monitoring and selective measurement are prevalent.
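To make the two-stage missingness mechanism concrete, the following is a minimal simulation sketch, not the paper's specification. It assumes a log-linear Poisson visit intensity, a logistic measurement probability, and a linear outcome model, all tied together by one shared Gaussian latent variable per subject. The function name `simulate_ehr`, every parameter value, and the omission of a time trend are illustrative assumptions.

```python
import numpy as np

def simulate_ehr(n_subjects=2000, followup=5.0, seed=0):
    """Toy data with informative visiting and informative observation.

    All functional forms and parameter values are assumptions made for
    illustration; they are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    beta0, beta1 = 1.0, 0.5            # outcome intercept and covariate effect
    gamma_visit, gamma_obs = 0.8, 1.0  # loadings of the latent variable
    base_rate = 2.0                    # baseline visits per unit time
    rows = []
    for i in range(n_subjects):
        x = rng.binomial(1, 0.5)       # binary covariate
        b = rng.normal(0.0, 1.0)       # shared subject-specific latent variable
        # Stage 1: visit count depends on b (informative visiting).
        n_visits = rng.poisson(base_rate * followup * np.exp(gamma_visit * b))
        times = np.sort(rng.uniform(0.0, followup, size=n_visits))
        # Stage 2: measurement at a visit depends on b (informative observation).
        p_obs = 1.0 / (1.0 + np.exp(-(-0.5 + gamma_obs * b)))
        measured = rng.random(n_visits) < p_obs
        # Outcome shares the same b, which induces the dependence.
        y = beta0 + beta1 * x + b + rng.normal(0.0, 0.5, size=n_visits)
        for t, m, yy in zip(times, measured, y):
            rows.append((i, x, t, m, yy))
    # Columns: subject id, covariate, visit time, measured flag, outcome.
    return np.array(rows, dtype=float), (beta0, beta1)

data, (beta0, beta1) = simulate_ehr()
measured = data[:, 3] == 1.0
reference_group = measured & (data[:, 1] == 0.0)
# Naive mean over measured records only, ignoring both mechanisms.
naive_intercept = data[reference_group, 4].mean()
print(f"true intercept {beta0:.2f}, naive estimate {naive_intercept:.2f}")
```

In this toy setup, subjects with a large latent value visit more often and are measured more often, so the naive mean over recorded values overstates the population intercept. The sketch only illustrates why ignoring the two mechanisms can bias estimates; it does not implement the three-stage estimator or the sequential imputation procedure described in the abstract.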
Problem

Research questions and friction points this paper is trying to address.

informative visiting
informative observation
longitudinal EHR data
missingness mechanism
biomarker trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

joint modeling
informative visiting
informative observation
shared random effects
semiparametric framework