When Bias Backfires: The Modulatory Role of Counterfactual Explanations on the Adoption of Algorithmic Bias in XAI-Supported Human Decision-Making

📅 2025-05-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates whether counterfactual explanations can mitigate the internalization of AI-induced bias by human decision-makers. Method: A behavioral experiment simulating a hiring scenario examined the temporal effects of gender-biased AI recommendations on human judgments, comparing conditions with and without counterfactual explanations. Contribution/Results: (1) First empirical evidence that counterfactual explanations can reverse AI-induced preference shifts rather than merely raise bias awareness; (2) Poorly calibrated XAI can exacerbate bias internalization; (3) Only 8 of 294 participants (2.7%) detected the bias despite a 70% compliance rate, revealing a strong implicit influence; (4) Without explanations, the bias persisted after the interaction, whereas conditions with explanations showed a significant reversal; (5) Trust did not differ significantly across conditions, but confidence dynamics suggested active cognitive recalibration. These findings provide novel empirical evidence for evaluating XAI efficacy and mitigating bias, offering actionable design principles for responsible AI deployment.

📝 Abstract
Although the integration of artificial intelligence (AI) into everyday tasks improves efficiency and objectivity, it also risks transmitting bias to human decision-making. In this study, we conducted a controlled experiment that simulated hiring decisions to examine how biased AI recommendations, augmented with or without counterfactual explanations, influence human judgment over time. Participants, acting as hiring managers, completed 60 decision trials divided into a baseline phase without AI, followed by a phase with biased (X)AI recommendations (favoring either male or female candidates), and a final post-interaction phase without AI. Our results indicate that participants followed the AI recommendations 70% of the time when the candidates' qualifications were comparable. Yet only a fraction of participants (8 out of 294) detected the gender bias. Crucially, exposure to biased AI altered participants' inherent preferences: in the post-interaction phase, participants' independent decisions aligned with the bias when no counterfactual explanations had been provided, but reversed the bias when explanations had been given. Reported trust did not differ significantly across conditions. Confidence varied across the study phases after exposure to male-biased AI, indicating nuanced effects of AI bias on decision certainty. Our findings underscore the importance of calibrating XAI to avoid unintended behavioral shifts, safeguard equitable decision-making, and prevent the adoption of algorithmic bias.
Problem

Research questions and friction points this paper is trying to address.

How biased AI recommendations influence human decision-making over time
Whether counterfactual explanations can reverse the adoption of AI-induced bias
How XAI calibration affects equitable decision-making and bias prevention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Used counterfactual explanations to modulate bias adoption
Simulated hiring decisions with biased AI recommendations
Measured behavioral shifts after the AI interaction ended