🤖 AI Summary
This study addresses a problem in high-stakes binary decision-making: decision-makers struggle to determine when to trust an AI prediction based solely on the AI's communicated confidence, which limits the value of AI assistance. The authors show that learning to make optimal decisions through repeated interactions is equivalent to a two-armed online contextual learning problem with full feedback, and they relate the degree of alignment between human and AI confidence to the complexity of that learning problem. They prove a lower bound of $\Omega(\sqrt{|H| \cdot |B| \cdot T})$ on the expected regret any learner can attain, where $H$ and $B$ denote the sets of human and AI confidence values. Under perfect alignment, a learner can attain an expected regret of $O(\sqrt{|H| \cdot T \log T})$, which a generalization of the Dvoretzky–Kiefer–Wolfowitz (DKW) inequality improves to $O(\sqrt{T \log T})$ when $\sqrt{|H|} = O(\log T)$ and $B$ is countable. Experiments on real data from two human-subject studies indicate that these theoretical results are robust to violations of perfect alignment.
📝 Abstract
It is widely agreed that when AI models assist decision-makers in high-stakes domains by predicting an outcome of interest, they should communicate the confidence of their predictions. However, empirical evidence suggests that decision-makers often struggle to determine when to trust a prediction based solely on this communicated confidence. In this context, recent theoretical and empirical work suggests a positive correlation between the utility of AI-assisted decision-making and the degree of alignment between the AI confidence and the decision-makers' confidence in their own predictions. Crucially, these findings do not yet elucidate the extent to which this alignment influences the complexity of learning to make optimal decisions through repeated interactions. In this paper, we address this question in the canonical case of binary predictions and binary decisions. We first show that this problem is equivalent to a two-armed online contextual learning problem with full feedback, and establish a lower bound of $\Omega(\sqrt{|H| \cdot |B| \cdot T})$ on the expected regret any learner can attain, where $H$ and $B$ denote the sets of human and AI confidence values. We then demonstrate that, under perfect alignment between AI and human confidence, a learner can attain an expected regret of $O(\sqrt{|H| \cdot T\log T})$ and, when $\sqrt{|H|} = O(\log T)$ and $B$ is countable, a non-trivial generalization of the Dvoretzky-Kiefer-Wolfowitz inequality improves the regret bound to $O(\sqrt{T\log T})$. Taken together, these results reveal that alignment can reduce the complexity of learning to make decisions with AI assistance. Experiments on real data from two different human-subject studies where participants solve simple decision-making tasks assisted by AI models show that our theoretical results are robust to violations of perfect alignment.
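
For readers unfamiliar with the Dvoretzky-Kiefer-Wolfowitz inequality, the following is a minimal sketch of its classical form only, not the generalization developed in the paper. The classical result states that for $n$ i.i.d. samples with empirical CDF $F_n$ and true CDF $F$, $P(\sup_x |F_n(x) - F(x)| > \varepsilon) \le 2e^{-2n\varepsilon^2}$. The function names (`dkw_epsilon`, `sup_deviation_uniform`, `coverage`) and the choice of a Uniform(0, 1) test distribution are assumptions made purely for illustration.

```python
import math
import random


def dkw_epsilon(n, delta):
    """Half-width eps with P(sup_x |F_n(x) - F(x)| > eps) <= delta (classical DKW)."""
    # Solve 2 * exp(-2 * n * eps^2) = delta for eps.
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def sup_deviation_uniform(samples):
    """sup_x |F_n(x) - x| for samples assumed drawn from Uniform(0, 1)."""
    xs = sorted(samples)
    n = len(xs)
    # The supremum is attained just before or at one of the sample points.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def coverage(n, delta, trials, seed=0):
    """Fraction of simulated trials in which the true CDF stays inside the DKW band."""
    rng = random.Random(seed)
    eps = dkw_epsilon(n, delta)
    hits = 0
    for _ in range(trials):
        draws = [rng.random() for _ in range(n)]
        if sup_deviation_uniform(draws) <= eps:
            hits += 1
    return hits / trials
```

Setting `delta` on the order of $1/T$ with $n = T$ samples gives a band half-width of order $\sqrt{\log T / T}$, which is the scale at which $\sqrt{T \log T}$-type terms typically arise; how the paper's generalized inequality enters its regret analysis is not specified in the abstract.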