Statistical methods for clustered competing risk data when the event types are only available in a training dataset

📅 2025-05-04
🤖 AI Summary
This paper addresses clustered competing-risks data in which event types are missing in the main study and observed only in a labeled training dataset. Method: The paper proposes a cause-specific proportional hazards frailty model, with random effects accounting for within-cluster correlation, that uses predicted event-type probabilities from a classification model fitted on the training dataset. Two estimation strategies are offered: a weighted penalized partial likelihood, in which the weights are the predicted probabilities of each event type, and an imputation approach, in which missing event types are imputed from the classifier's predictions. Analytical variances are derived for the estimators. Results: Monte Carlo simulations indicate unbiased estimation, accurate standard error estimation, and nominal coverage of confidence intervals. Applied to the Conservation of Hearing Study Audiology Assessment Arm, the methods estimate the associations between tinnitus and metabolic, sensory, and metabolic+sensory hearing loss.

📝 Abstract
We develop methods to analyze clustered competing risks data when the event types are only available in a training dataset and are missing in the main study. We propose to estimate the exposure effects through the cause-specific proportional hazards frailty model where random effects are introduced into the model to account for the within-cluster correlation. We propose a weighted penalized partial likelihood method where the weights represent the probabilities of the occurrence of events, and the weights can be obtained by fitting a classification model for the event types on the training dataset. Alternatively, we propose an imputation approach where the missing event types are imputed based on the predictions from the classification model. We derive the analytical variances, and evaluate the finite sample properties of our methods in an extensive simulation study. As an illustrative example, we apply our methods to estimate the associations between tinnitus and metabolic, sensory and metabolic+sensory hearing loss in the Conservation of Hearing Study Audiology Assessment Arm.
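The weighted penalized partial likelihood described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_penalized_pl`, the normal random-effect penalty with variance `theta`, and the Breslow-style risk sets are all assumptions made for illustration. The weight `w[i]` stands for the probability that subject i's event is of the cause being modeled (an indicator in the training data, a classifier prediction in the main study).

```python
import numpy as np

def weighted_penalized_pl(beta, b, theta, X, cluster, time, event, w):
    """Weighted penalized partial log-likelihood for one cause (illustrative sketch).

    beta    : fixed-effect coefficients, shape (p,)
    b       : cluster-level random effects, shape (n_clusters,)
    theta   : assumed variance of normally distributed random effects
    X       : covariate matrix, shape (n, p)
    cluster : cluster index of each subject, shape (n,)
    time    : observed follow-up time, shape (n,)
    event   : 1 if an event of any type was observed, 0 if censored
    w       : probability that the subject's event is of this cause
    """
    eta = X @ beta + b[cluster]              # linear predictor with cluster effect
    loglik = 0.0
    for i in np.flatnonzero(event):          # loop over subjects with an event
        at_risk = time >= time[i]            # risk set at subject i's event time
        loglik += w[i] * (eta[i] - np.log(np.exp(eta[at_risk]).sum()))
    penalty = 0.5 * np.sum(b ** 2) / theta   # penalty from the assumed normal frailty
    return loglik - penalty

# Toy data: four subjects in two clusters (values are made up).
X = np.array([[0.0], [1.0], [1.0], [0.0]])
cluster = np.array([0, 0, 1, 1])
time = np.array([2.0, 3.0, 1.0, 4.0])
event = np.array([1, 1, 0, 1])
w = np.array([0.7, 1.0, 0.0, 0.2])           # second subject has a known event type

value = weighted_penalized_pl(np.zeros(1), np.zeros(2), 1.0,
                              X, cluster, time, event, w)
```

In practice the function would be maximized over `beta` and `b` with a numerical optimizer, once per event type; the variance estimation the abstract mentions is outside the scope of this sketch.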
Problem

Research questions and friction points this paper is trying to address.

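Analyze clustered competing risks data when event types are missing in the main study and observed only in a training dataset
Estimate exposure effects using a cause-specific proportional hazards frailty model
Develop weighting and imputation methods that use a classification model's predictions in place of the missing event types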
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cause-specific proportional hazards frailty model
Weighted penalized partial likelihood method
Imputation approach for missing event types
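As a rough illustration of the imputation idea, the sketch below fills in missing event types from a classifier's predicted probabilities. The abstract does not say whether the imputation uses the most likely type or a random draw, so both are shown as assumptions; `impute_event_types` and the probability matrix are hypothetical.

```python
import numpy as np

def impute_event_types(probs, rng=None):
    """Impute one event type per subject from predicted probabilities.

    probs : array of shape (n, K); each row sums to 1 over K event types
    rng   : optional numpy Generator; if given, types are drawn at random,
            otherwise the most probable type is used
    """
    probs = np.asarray(probs, dtype=float)
    if rng is None:
        return probs.argmax(axis=1)          # deterministic: most likely type
    n_types = probs.shape[1]
    # Stochastic: one categorical draw per subject
    return np.array([rng.choice(n_types, p=p) for p in probs])

# Made-up predicted probabilities for three main-study events.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.0, 1.0, 0.0],
                  [0.2, 0.3, 0.5]])

hard = impute_event_types(probs)                             # most likely type
draws = impute_event_types(probs, np.random.default_rng(0))  # random draws
```

The imputed labels could then be treated as observed event types when fitting the cause-specific frailty model; a random-draw version would typically be repeated several times and the fits combined.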