🤖 AI Summary
In clinical survival analysis, updating a risk model by combining existing external risk models with new local data that include additional covariates remains challenging.
Method: This paper proposes CARE (Convex Aggregation of relative Risk Estimators). The method first fits a flexible family of reproducing kernel estimators to the new data by penalized partial likelihood maximization, then forms the final relative risk estimator as a convex combination of the kernel and external estimators. The convex combination coefficients and regularization parameters are selected via cross-validation.
Contribution/Results: High-probability bounds on the $L_2$-error show that CARE achieves a convergence rate at least as good as both the optimal kernel estimator and the best external model. Simulation studies align with these theoretical results, and an application to cardiovascular disease risk modelling illustrates the improvements the method provides. The method is implemented in the Python package *care-survival*.
📝 Abstract
Clinical risk prediction models are regularly updated as new data, often with additional covariates, become available. We propose CARE (Convex Aggregation of relative Risk Estimators) as a general approach for combining existing "external" estimators with a new data set in a time-to-event survival analysis setting. Our method initially employs the new data to fit a flexible family of reproducing kernel estimators via penalised partial likelihood maximisation. The final relative risk estimator is then constructed as a convex combination of the kernel and external estimators, with the convex combination coefficients and regularisation parameters selected using cross-validation. We establish high-probability bounds for the $L_2$-error of our proposed aggregated estimator, showing that it achieves a rate of convergence that is at least as good as both the optimal kernel estimator and the best external model. Empirical results from simulation studies align with the theoretical results, and we illustrate the improvements our methods provide for cardiovascular disease risk modelling. Our methodology is implemented in the Python package care-survival.
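The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It does not use the care-survival API; the function names `neg_log_partial_likelihood` and `select_convex_weight` are hypothetical. The sketch assumes a single external estimator, no tied event times, a combination taken on the log relative risk scale, and kernel scores that have already been fitted, so it selects only the convex weight on held-out data and omits the regularisation parameter search and the cross-validation folds.

```python
import numpy as np


def neg_log_partial_likelihood(scores, times, events):
    """Negative log partial likelihood, averaged over observed events.

    Assumes no tied times. `scores` are log relative risks.
    """
    order = np.argsort(-times)  # latest time first
    s = scores[order]
    e = events[order]
    # Running log-sum-exp: entry i is log sum_{j: T_j >= T_i} exp(s_j),
    # i.e. the log of the risk-set total for subject i.
    log_risk = np.logaddexp.accumulate(s)
    return -np.sum((s - log_risk) * e) / max(e.sum(), 1.0)


def select_convex_weight(f_kernel, f_external, times, events, grid=None):
    """Grid search for theta in [0, 1] minimising the held-out loss of
    (1 - theta) * f_kernel + theta * f_external."""
    grid = np.linspace(0.0, 1.0, 21) if grid is None else np.asarray(grid)
    losses = [
        neg_log_partial_likelihood(
            (1.0 - t) * f_kernel + t * f_external, times, events
        )
        for t in grid
    ]
    best = int(np.argmin(losses))
    return float(grid[best]), float(losses[best])


# Synthetic illustration (all data below are simulated, not from the paper).
rng = np.random.default_rng(0)
n = 500
x_old, x_new = rng.normal(size=n), rng.normal(size=n)
true_score = x_old + x_new
event_time = rng.exponential(1.0 / np.exp(true_score))
censor_time = rng.exponential(2.0, size=n)
events = (event_time <= censor_time).astype(float)
observed = np.minimum(event_time, censor_time)

f_external = x_old  # external model sees only the old covariate
f_kernel = 0.8 * x_old + x_new + rng.normal(scale=0.5, size=n)  # noisy local fit

theta, loss = select_convex_weight(f_kernel, f_external, observed, events)
```

Because the grid contains both endpoints, the selected combination is never worse on the held-out loss than either the kernel scores alone (`theta = 0`) or the external scores alone (`theta = 1`), which mirrors, in a purely empirical sense, the "at least as good as both" flavour of the theoretical guarantee.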