Time-to-Event Modeling with Pseudo-Observations in Federated Settings

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of patient privacy constraints preventing centralized sharing of individual survival data in multicenter clinical studies, this paper proposes a non-iterative, one-shot federated survival analysis framework. The method leverages distributed pseudo-observations, integrating sequentially updated Kaplan–Meier estimates with renewable generalized linear models. It introduces an adaptive soft-thresholding debiasing mechanism to correct for site-level heterogeneity and supports both proportional and non-proportional hazards modeling. The framework accommodates both time-invariant and time-varying covariates, enabling privacy-preserving estimation of global survival probabilities and inference on covariate effects. Experiments on simulated data and a real-world pediatric obesity cohort demonstrate performance comparable to centralized Cox regression and ODAC models; local site estimates align closely with the global result, validating the method’s effectiveness, robustness, and practical utility.

📝 Abstract
In multi-center clinical studies, concerns about patient privacy often prohibit pooling individual-level time-to-event data. We propose a non-iterative, one-shot federated framework using distributed pseudo-observations, derived from a sequentially updated Kaplan-Meier estimator and fitted with renewable generalized linear models. This framework enables the estimation of survival probabilities at specified landmark times and accommodates both time-invariant and time-varying covariate effects. To capture site-level heterogeneity, we introduce a soft-thresholding debiasing procedure that adaptively shrinks local estimates toward the global fit. Through extensive simulations across varying event rates and site-size distributions, our method demonstrates performance comparable to pooled Cox and the one-shot Optimal Distributed Aggregation (ODAC) models, with added flexibility to capture non-proportional hazards. We applied the method to pediatric obesity data from the Chicago Area Patient-Centered Outcomes Research Network (CAPriCORN), which comprises four sites and a total of 45,865 patients. The federated pseudo-value regression model produced estimates of both time-constant and time-varying hazard ratios that closely aligned with those obtained from the pooled analysis, demonstrating its utility as a robust and privacy-preserving alternative for collaborative survival research. To further address potential heterogeneity across sites, we applied a covariate-wise debiasing algorithm, enabling site-level adjustments while preserving consistency with the global model.
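As a rough illustration of the pseudo-observation idea, the sketch below computes standard jackknife pseudo-observations for the survival probability S(t0) from a Kaplan-Meier estimate on a single sample. It is not the paper's sequentially updated, federated estimator. The function names `km_survival` and `pseudo_observations` and the landmark argument `t0` are assumptions made for this example.

```python
import numpy as np

def km_survival(times, events, t0):
    """Kaplan-Meier estimate of S(t0) from right-censored data.

    events[i] is 1 for an observed event and 0 for a censored time.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    surv = 1.0
    # Multiply conditional survival factors over distinct event times <= t0.
    for t in np.unique(times[(events == 1) & (times <= t0)]):
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & (events == 1))
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_observations(times, events, t0):
    """Jackknife pseudo-observations: n * S_hat - (n - 1) * S_hat_without_i."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    n = len(times)
    full = km_survival(times, events, t0)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False  # leave subject i out
        loo = km_survival(times[keep], events[keep], t0)
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True   # restore subject i
    return pseudo
```

With no censoring, each pseudo-observation reduces to the indicator that the subject survived past `t0`. With censoring, the values stand in for that unobserved indicator and can serve as the response in a generalized linear model.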
Problem

Research questions and friction points this paper is trying to address.

Estimating survival probabilities without pooling patient data
Handling time-invariant and time-varying covariate effects
Addressing site-level heterogeneity in federated settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-iterative federated framework with pseudo-observations
Soft-thresholding debiasing for site-level heterogeneity
Covariate-wise debiasing algorithm for global consistency
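The soft-thresholding step can be pictured with a minimal sketch, assuming the simplest covariate-wise form: the difference between a site's coefficient and the global coefficient is soft-thresholded, so small deviations collapse to the global fit and large ones are retained (reduced by the threshold). The function names and the fixed threshold `lam` are hypothetical; the summary does not specify how the paper chooses its adaptive threshold.

```python
import numpy as np

def soft_threshold(x, lam):
    """Soft-thresholding operator: sign(x) * max(|x| - lam, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def shrink_toward_global(beta_local, beta_global, lam):
    """Shrink each local coefficient toward the global one (illustrative only)."""
    beta_local = np.asarray(beta_local, dtype=float)
    beta_global = np.asarray(beta_global, dtype=float)
    # Deviations smaller than lam are treated as noise and set to zero.
    return beta_global + soft_threshold(beta_local - beta_global, lam)

# Hypothetical coefficients for two covariates at one site.
beta_site = shrink_toward_global([0.55, -0.2], [0.5, -1.0], lam=0.1)
```

In this example the first covariate deviates from the global value by only 0.05, so it is pulled back to the global estimate, while the second keeps a site-specific deviation reduced by `lam`.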
Hyojung Jang
Division of Biostatistics, Department of Preventive Medicine, Northwestern University
Malcolm Risk
University of Michigan
Biostatistics
Yaojie Wang
Department of Preventive Medicine, Northwestern University
Norrina Bai Allen
Department of Preventive Medicine, Northwestern University
Xu Shi
University of Michigan
Electronic Health Record, Causal Inference, Negative Control, Machine Translation
Lili Zhao
Division of Biostatistics, Department of Preventive Medicine, Northwestern University