🤖 AI Summary
This study addresses a challenge in digital therapeutics: patient adherence is endogenous to treatment recommendations and evolves over time. The authors propose an online learning framework that jointly models adherence feedback and intervention decisions. Treating adherence as an endogenous state variable influenced by recommendations, they formulate a linear dynamical system, establish finite-time identification guarantees for it, and develop the UCB-BOLD algorithm, which achieves sublinear regret. In simulation experiments on a synthetic patient cohort generated using micro-randomized trial data, the proposed method reduces conditional value-at-risk regret by a factor of 2–3 compared to the next-best benchmark, suggesting more efficient long-term resource allocation in adaptive interventions.
📝 Abstract
A critical challenge facing clinicians managing chronic disease interventions is sustaining long-run patient health given limited information and resources. Digital therapeutics (DTs) provide a cost-effective way to manage interventions at scale through repeated interactions (e.g., daily treatment recommendations), but patient success depends heavily on adherence. Behavioral psychology suggests that both treatment recommendations and past adherence affect future adherence, yet existing decision support frameworks for DTs model only recommendation effects or treat adherence as exogenous context, leaving a key gap in model and algorithm development. To address this gap, we present a DT decision support framework that captures both recommendation and adherence effects, allowing clinicians to better plan treatment recommendations. We model a patient's time-varying capacity for engagement with treatment using a linear dynamical system (LDS) driven by both recommendations and past adherence, and we connect this latent capacity endogenously to adherence behavior through a logit link. We establish finite-time identification guarantees for this model, extending LDS results to our setting. Next, we propose an optimism-based algorithm, UCB-BOLD, for online treatment selection and prove that it achieves sublinear regret. We evaluate UCB-BOLD against benchmarks via ablation studies on a synthetic patient cohort generated using micro-randomized trial data. By incorporating dynamical models, DT decision support tools can help decision makers use the data available in DT settings efficiently and improve patient health through effective resource allocation. While myopic or heuristic approaches suffice for some patient types, the benefits of explicitly planning around recommendation and adherence effects are significant for others; UCB-BOLD achieves 2–3× lower conditional value-at-risk regret than the next-best benchmark.
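To make the modeling idea concrete, the following is a minimal sketch of a scalar latent state that evolves linearly in response to recommendations and past adherence, with adherence drawn through a logit link. It is an illustration under stated assumptions, not the paper's model or the UCB-BOLD algorithm: the scalar state, the update form, the function names `sigmoid` and `simulate_patient`, and the coefficients `a`, `b`, `c`, and `noise_sd` are all hypothetical choices made here, since the abstract does not specify them.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function (the inverse of the logit link)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_patient(recommendations, a=0.8, b=0.3, c=0.5,
                     noise_sd=0.1, x0=0.0, seed=0):
    """Simulate a hypothetical scalar engagement state and binary adherence.

    Assumed (illustrative) dynamics, not taken from the paper:
        y_t     ~ Bernoulli(sigmoid(x_t))            # adherence via logit link
        x_{t+1} = a*x_t + b*u_t + c*y_t + noise      # linear state update
    where u_t is the recommendation at step t.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    x = x0
    states, adherence = [], []
    for u in recommendations:
        p = sigmoid(x)                    # probability the patient adheres
        y = 1 if rng.random() < p else 0  # realized adherence
        states.append(x)
        adherence.append(y)
        # Both the recommendation and the realized adherence feed the next state.
        x = a * x + b * u + c * y + rng.gauss(0.0, noise_sd)
    return states, adherence


# Hypothetical 10-day schedule: 1 = recommendation sent, 0 = none.
states, adherence = simulate_patient([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
print(list(zip([round(s, 3) for s in states], adherence)))
```

Because realized adherence enters the state update, adherence here is endogenous: a recommendation changes future adherence both directly and through the adherence it induces today, which is the kind of effect a myopic policy would ignore.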