🤖 AI Summary
This study addresses the exploration-exploitation trade-off in AI-driven dynamic recommendation of actionable interventions (e.g., patient behavior adjustments) within human-AI collaborative clinical decision-making. We propose the first human-expert-in-the-loop linear recourse bandit framework, with theoretical guarantees of warm-start capability, low human effort, and robustness to variability in human decisions. Our method extends linear UCB to construct a recourse policy, integrating human-feedback-driven adaptive confidence-interval updates with a cost-aware interaction mechanism. Evaluated on real-world clinical cases, it significantly reduces cumulative regret versus baselines, improves initial performance by 32%, and decreases human interventions by ~80%. The core contribution is formalizing human intervention as a cost-constrained linear recourse bandit process, enabling unified optimization of recommendation performance, operational efficiency, and decision robustness.
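To make the "linear UCB extended with a recourse policy" idea concrete, here is a minimal sketch. It assumes recourse is modeled as choosing between an arm's original feature vector and a modified one at a fixed modification cost; the class and parameter names are illustrative, not the paper's implementation.

```python
import numpy as np

class RLinUCBSketch:
    """Illustrative recourse-aware LinUCB: scores each arm both as-is and
    with a (costly) feature modification, then picks the highest UCB."""

    def __init__(self, dim, alpha=1.0, lam=1.0):
        self.alpha = alpha              # exploration-bonus scale
        self.A = lam * np.eye(dim)      # regularized Gram matrix
        self.b = np.zeros(dim)          # accumulated reward-weighted features

    def _ucb(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b          # ridge estimate of reward weights
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)

    def select(self, candidates, cost=0.1):
        """candidates: list of (original_feature, modified_feature) pairs.
        Returns (arm_index, effective_feature) maximizing cost-aware UCB."""
        best, best_val = None, -np.inf
        for i, (x, x_mod) in enumerate(candidates):
            for x_eff, c in ((x, 0.0), (x_mod, cost)):
                val = self._ucb(x_eff) - c
                if val > best_val:
                    best, best_val = (i, x_eff), val
        return best

    def update(self, x, reward):
        # standard LinUCB ridge-regression update on the played feature
        self.A += np.outer(x, x)
        self.b += reward * x
```

Scoring the modified feature vector with its modification cost subtracted is one simple way to trade off recommendation quality against recourse effort inside a single UCB objective.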
📝 Abstract
Human doctors frequently recommend actionable recourses that allow patients to modify their conditions to access more effective treatments. Inspired by such healthcare scenarios, we propose the Recourse Linear UCB ($\textsf{RLinUCB}$) algorithm, which optimizes both action selection and feature modifications by balancing exploration and exploitation. We further extend this to the Human-AI Linear Recourse Bandit ($\textsf{HR-Bandit}$), which integrates human expertise to enhance performance. $\textsf{HR-Bandit}$ offers three key guarantees: (i) a warm-start guarantee for improved initial performance, (ii) a human-effort guarantee to minimize required human interactions, and (iii) a robustness guarantee that ensures sublinear regret even when human decisions are suboptimal. Empirical results, including a healthcare case study, validate its superior performance against existing benchmarks.
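The warm-start and human-effort guarantees suggest a natural interaction pattern: consult the human expert only while the model's confidence interval on its chosen action is wide, and act autonomously once the estimate tightens. The following simulation is a hedged sketch of that pattern under assumed details (a simulated near-optimal human, a fixed uncertainty threshold `tau`); it is not the paper's algorithm.

```python
import numpy as np

def width(A, x, alpha):
    """Confidence-interval half-width of the ridge estimate along x."""
    return alpha * np.sqrt(x @ np.linalg.inv(A) @ x)

def run_hr_loop(theta_true, T=400, dim=3, alpha=1.0, tau=0.5, seed=0):
    """Query a simulated human only while uncertainty about the chosen
    feature exceeds tau; return the rounds at which queries happened."""
    rng = np.random.default_rng(seed)
    A, b = np.eye(dim), np.zeros(dim)
    queries = []
    for t in range(T):
        cands = rng.normal(size=(4, dim))
        theta = np.linalg.solve(A, b)           # current ridge estimate
        scores = cands @ theta + np.array([width(A, x, alpha) for x in cands])
        x = cands[np.argmax(scores)]            # autonomous UCB choice
        if width(A, x, alpha) > tau:            # too uncertain: ask the expert
            x = cands[np.argmax(cands @ theta_true)]  # human picks near-optimally
            queries.append(t)
        A += np.outer(x, x)                     # learn from either choice
        b += (theta_true @ x + rng.normal(scale=0.1)) * x
    return queries
```

Because the interval width shrinks as data accumulates, human queries concentrate in the earliest rounds: the expert provides the warm start, and the query count stays small relative to the horizon, mirroring the human-effort guarantee in spirit.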