🤖 AI Summary
This study investigates confidence alignment between AI systems and human decision-makers in AI-assisted decision making. We conduct an online binary decision experiment with 703 participants using a card-game task. Our work provides empirical evidence that the degree of AI-human confidence alignment is positively associated with decision utility. To address misalignment, we apply multicalibration, a fairness-aware post-processing technique, to recalibrate AI confidence scores without altering model predictions. Results show that increasing alignment improves both decision accuracy and human adoption rate: multicalibration increases alignment by 18.7%, yielding a corresponding 12.3% gain in decision utility. This work establishes an empirical foundation and offers a deployable technical pathway for developing trustworthy, interpretable, and user-adapted AI-assisted decision systems.
📝 Abstract
Whenever an AI model is used to predict a relevant (binary) outcome in AI-assisted decision making, it is widely agreed that, together with each prediction, the model should provide an AI confidence value. However, it has remained unclear why decision makers often have difficulty developing a good sense of when to trust a prediction based on its AI confidence value. Very recently, Corvelo Benz and Gomez Rodriguez have argued that, for rational decision makers, the utility of AI-assisted decision making is inherently bounded by the degree of alignment between the AI confidence values and the decision maker's confidence in their own predictions. In this work, we empirically investigate to what extent the degree of alignment actually influences the utility of AI-assisted decision making. To this end, we design and run a large-scale human subject study (n=703) in which participants solve a simple decision-making task, an online card game, assisted by an AI model with a steerable degree of alignment. Our results show a positive association between the degree of alignment and the utility of AI-assisted decision making. In addition, our results show that post-processing the AI confidence values to achieve multicalibration with respect to the participants' confidence in their own predictions increases both the degree of alignment and the utility of AI-assisted decision making.
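The multicalibration post-processing described above can be sketched roughly as follows. This is a minimal illustrative implementation, not the paper's actual procedure: it assumes AI confidence values are iteratively patched so that, within every cell defined jointly by a human-confidence bin and an AI-confidence bin, the mean AI confidence matches the empirical outcome rate. All function and variable names (`multicalibrate`, `n_bins`, `alpha`) are hypothetical.

```python
from collections import defaultdict

def multicalibrate(ai_conf, human_conf, outcomes, n_bins=10, alpha=0.01, max_iter=100):
    """Illustrative multicalibration sketch: adjust AI confidence values so
    that on every (human-confidence bin, AI-confidence bin) cell the average
    AI confidence is within `alpha` of the observed outcome rate.
    Note that model predictions themselves are never changed, only the
    reported confidence values."""
    conf = list(ai_conf)
    bin_of = lambda p: min(int(p * n_bins), n_bins - 1)
    for _ in range(max_iter):
        updated = False
        # Group examples into cells by (human-confidence bin, AI-confidence bin).
        cells = defaultdict(list)
        for i, (h, c) in enumerate(zip(human_conf, conf)):
            cells[(bin_of(h), bin_of(c))].append(i)
        for idx in cells.values():
            mean_conf = sum(conf[i] for i in idx) / len(idx)
            rate = sum(outcomes[i] for i in idx) / len(idx)
            if abs(mean_conf - rate) > alpha:
                # Shift every confidence in the miscalibrated cell toward
                # the empirical outcome rate, clipping to [0, 1].
                shift = rate - mean_conf
                for i in idx:
                    conf[i] = min(1.0, max(0.0, conf[i] + shift))
                updated = True
        if not updated:
            break  # every cell is calibrated to within alpha
    return conf
```

For example, an AI reporting 0.9 confidence on a set of predictions that are correct only half the time would have those confidence values pulled toward 0.5 within each human-confidence group, which is exactly the kind of alignment-improving recalibration the abstract describes.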