Online Survival Analysis: A Bandit Approach under Cox PH Model

📅 2026-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses staggered enrollment, delayed feedback, and right censoring in online sequential treatment decision-making by integrating the Cox proportional hazards model with the multi-armed bandit framework. The method combines semiparametric survival analysis with online bandit learning and is adapted to three canonical bandit algorithms, so that treatment strategies can be updated as censored time-to-event data continue to arrive. The adapted algorithms come with sublinear regret guarantees. Extensive simulations and semi-real experiments based on SEER cancer registry data indicate that the approach learns near-optimal treatment policies rapidly and effectively.
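To make the three data challenges concrete, the following minimal sketch shows what a learner can actually observe at a given calendar time when patients enroll at different times. It is an illustration under stated assumptions, not the paper's data model: the function name `observe_at` and its arguments are hypothetical, and censoring here comes only from the analysis time cutting off follow-up.

```python
import numpy as np

def observe_at(calendar_time, entry_times, event_times):
    """Hypothetical helper: data visible to the learner at `calendar_time`.

    entry_times: calendar time at which each patient enrolled (staggered entry).
    event_times: latent time from enrollment to the event, unknown to the
                 learner until the event has actually occurred (delayed feedback).
    Returns (observed_durations, event_indicators, enrolled_mask).
    """
    entry_times = np.asarray(entry_times, dtype=float)
    event_times = np.asarray(event_times, dtype=float)
    # Only patients who have already enrolled contribute any data.
    enrolled = entry_times <= calendar_time
    # Follow-up accrued so far; zero for patients not yet enrolled.
    follow_up = np.clip(calendar_time - entry_times, 0.0, None)
    # The learner sees the smaller of the event time and the follow-up so far.
    observed = np.minimum(event_times, follow_up)
    # The event is observed only if it happened within the follow-up window;
    # otherwise the patient is right-censored at the current follow-up.
    events = (event_times <= follow_up) & enrolled
    return observed, events, enrolled

durations, events, enrolled = observe_at(
    calendar_time=4.0,
    entry_times=[0.0, 2.0, 5.0],
    event_times=[3.0, 10.0, 1.0],
)
```

In this example the first patient's event has been observed, the second is still under follow-up and therefore right-censored, and the third has not yet enrolled. As calendar time advances, censored records can turn into observed events, which is why estimates must be refreshed over time.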

📝 Abstract

Survival analysis is a widely used statistical framework for modeling time-to-event data under censoring. Classical methods, such as the Cox proportional hazards (Cox PH) model, offer a semiparametric approach to estimating the effects of covariates on the hazard function. Despite its importance, survival analysis remains largely unexplored in online settings, particularly within the bandit framework, where decisions must be made sequentially to optimize treatments as new data arrive over time. In this work, we take an initial step toward integrating survival analysis into a purely online learning setting under the Cox PH model, addressing key challenges including staggered entry, delayed feedback, and right censoring. We adapt three canonical bandit algorithms to balance exploration and exploitation and provide sublinear regret guarantees for them. Extensive simulations and semi-real experiments using SEER cancer data demonstrate that our approach enables rapid and effective learning of near-optimal treatment policies.
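For readers unfamiliar with the Cox PH model, the sketch below computes the standard negative log partial likelihood, the objective usually minimized to estimate covariate effects from right-censored data. It is a generic textbook form rather than the estimator used in the paper. The function name and arguments are hypothetical, and the code assumes there are no tied event times.

```python
import numpy as np

def cox_neg_log_partial_likelihood(beta, X, durations, events):
    """Negative Cox log partial likelihood (illustrative; assumes no ties).

    beta:      coefficient vector, shape (d,)
    X:         covariate matrix, shape (n, d)
    durations: observed follow-up times, shape (n,)
    events:    True if the event was observed, False if right-censored
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)

    # Linear predictor: the log relative hazard of each subject.
    eta = X @ beta
    # Sort by decreasing duration so that the risk set of subject i
    # (everyone still under observation at its time) is a prefix.
    order = np.argsort(-durations)
    eta_sorted = eta[order]
    events_sorted = events[order]
    # Running log-sum-exp gives log of the summed hazards over each risk set.
    log_risk = np.logaddexp.accumulate(eta_sorted)
    # Only observed events contribute terms; censored subjects
    # appear only inside the risk sets of others.
    return -np.sum((eta_sorted - log_risk)[events_sorted])

X = np.array([[0.0], [1.0], [2.0]])
nll = cox_neg_log_partial_likelihood(
    beta=[0.0], X=X, durations=[1.0, 2.0, 3.0], events=[True, True, True]
)
```

With `beta` set to zero every subject has the same hazard, so the value depends only on the risk-set sizes. A bandit algorithm could, in principle, re-minimize such an objective as new records arrive and use the fitted coefficients to compare treatments, although the exact update rule is specific to the paper.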

Problem

Research questions and friction points this paper is trying to address.

online survival analysis

bandit algorithms

Cox proportional hazards model

delayed feedback

right censoring

Innovation

Methods, ideas, or system contributions that make the work stand out.

online survival analysis

bandit algorithms

Cox proportional hazards model