Your Model Is Unfair, Are You Even Aware? Inverse Relationship Between Comprehension and Trust in Explainability Visualizations of Biased ML Models

📅 2025-07-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how explanatory visualizations shape non-expert users' comprehension of, perceived bias in, and trust toward ML models. Method: a controlled user study evaluates five widely adopted XAI tools—LIME, SHAP, Counterfactual Perturbation (CP), Anchors, and ELI5—through a purpose-built taxonomy of visualization design characteristics. Contribution/Results: the authors find an inverse relationship between comprehension and trust: the better users understand a model, the *less* they trust it. The relationship is mediated by bias perception and shown to be causal: manipulating visualization design significantly (p < 0.001) increases comprehension, raises perceived bias, and reduces trust. Crucially, this trust erosion is not inherent to explanation itself but traceable to design choices—specifically, the salient highlighting of sensitive attributes. Trust can be restored without compromising comprehension either through design interventions (e.g., attenuating emphasis on sensitive features) or through improvements in underlying model fairness. These findings inform human-centered XAI design and provide actionable guidance for equitable AI governance.
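For readers unfamiliar with these tools, here is a minimal sketch—not the paper's experimental code—of producing the kind of feature-attribution visualization the study evaluates, using the SHAP library on a scikit-learn model. The dataset and model are placeholder assumptions; the study's actual models and data are not specified here.

```python
# Minimal sketch of a SHAP feature-attribution visualization, the style of
# explanation the paper's taxonomy covers. Dataset and model are placeholders.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)

# Bar plot of mean |SHAP| per feature. Note: the shape of the output
# (per-class vs. single array) can vary across shap versions; here we
# select the positive class explicitly.
shap.plots.bar(shap_values[:, :, 1])
```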

📝 Abstract
Systems relying on ML have become ubiquitous, but so has biased behavior within them. Research shows that bias significantly affects stakeholders' trust in systems and how they use them. Further, stakeholders of different backgrounds view and trust the same systems differently. Thus, how ML models' behavior is explained plays a key role in comprehension and trust. We survey explainability visualizations, creating a taxonomy of design characteristics. We conduct user studies to evaluate five state-of-the-art visualization tools (LIME, SHAP, CP, Anchors, and ELI5) for model explainability, measuring how taxonomy characteristics affect comprehension, bias perception, and trust for non-expert ML users. Surprisingly, we find an inverse relationship between comprehension and trust: the better users understand the models, the less they trust them. We investigate the cause and find that this relationship is strongly mediated by bias perception: more comprehensible visualizations increase people's perception of bias, and increased bias perception reduces trust. We confirm this relationship is causal: Manipulating explainability visualizations to control comprehension, bias perception, and trust, we show that visualization design can significantly (p < 0.001) increase comprehension, increase perceived bias, and reduce trust. Conversely, reducing perceived model bias, either by improving model fairness or by adjusting visualization design, significantly increases trust even when comprehension remains high. Our work advances understanding of how comprehension affects trust and systematically investigates visualization's role in facilitating responsible ML applications.
Problem

Research questions and friction points this paper is trying to address.

Investigates how explainability visualizations affect trust in biased ML models
Explores the inverse relationship between comprehension and trust, mediated by bias perception
Evaluates how visualization design choices shape non-expert users' perceptions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Surveys explainability visualizations and derives a taxonomy of design characteristics
Measures comprehension, bias perception, and trust via user studies
Manipulates visualizations to control bias perception (see the sketch below)
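As a purely hypothetical illustration of the paper's design intervention—attenuating emphasis on sensitive features—the sketch below scales down the displayed attribution of assumed sensitive attributes before rendering a bar chart. The feature names, scores, sensitive-attribute set, and damping factor are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical illustration (not the authors' code) of attenuating the
# visual emphasis given to sensitive attributes in an attribution chart.
import matplotlib.pyplot as plt

SENSITIVE = {"sex", "race", "age"}  # assumed sensitive attributes

def attenuate(attributions, factor=0.5):
    """Scale down the displayed importance of sensitive features.

    `attributions` maps feature names to importance scores from any XAI
    tool; `factor` (0.5 here) is an illustrative damping choice.
    """
    return {
        name: score * (factor if name in SENSITIVE else 1.0)
        for name, score in attributions.items()
    }

# Illustrative scores, e.g., from a LIME or SHAP explanation.
attributions = {"income": 0.42, "sex": 0.31, "education": 0.18, "age": 0.09}
damped = attenuate(attributions)

names, scores = zip(*sorted(damped.items(), key=lambda kv: kv[1]))
plt.barh(names, scores)
plt.xlabel("displayed attribution (attenuated for sensitive features)")
plt.tight_layout()
plt.show()
```

Per the paper's findings, de-emphasizing sensitive attributes in this way can restore trust without reducing comprehension, because it lowers perceived bias rather than hiding explanatory content.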