🤖 AI Summary
This study investigates the differential impact and potential risks of eXplainable Artificial Intelligence (XAI) in dermatological diagnosis for laypersons versus primary-care clinicians. We integrate a fairness-aware dermatology AI model, a multimodal large language model (LLM)-based explainer, and a human-centered experimental design to conduct large-scale user studies quantifying automation bias and decision fairness. Our key contributions are threefold: (1) we identify, for the first time, a proficiency-dependent bidirectional effect of XAI: it significantly increases public trust while exacerbating automation bias (a 23% accuracy drop when the AI errs), yet improves diagnostic robustness among clinicians; (2) we demonstrate that explanation timing is critical, as presenting the AI recommendation before the explanation significantly degrades judgment quality in erroneous cases; and (3) we empirically validate that XAI assistance improves cross-skin-tone diagnostic accuracy and reduces demographic performance disparities. These findings provide crucial human-factors evidence and actionable design principles for the responsible clinical deployment of XAI.
📝 Abstract
Artificial intelligence (AI) is increasingly permeating healthcare, from physician assistants to consumer applications. Because the opacity of AI algorithms hinders human interaction, explainable AI (XAI) aims to provide insight into AI decision-making; however, evidence suggests XAI can paradoxically induce over-reliance or bias. We present results from two large-scale experiments (623 laypeople; 153 primary care physicians, PCPs) that combined a fairness-aware diagnostic AI model with different XAI explanations to examine how XAI assistance, particularly from multimodal large language models (LLMs), influences diagnostic performance. AI assistance balanced across skin tones improved accuracy and reduced diagnostic disparities. However, LLM explanations yielded divergent effects: lay users showed greater automation bias (accuracy boosted when the AI was correct, reduced when it erred), while experienced PCPs remained resilient, benefiting irrespective of AI accuracy. Presenting AI suggestions before explanations also led to worse outcomes for both groups when the AI was incorrect. These findings highlight XAI's varying impact across expertise levels and presentation timing, underscoring LLMs as a "double-edged sword" in medical AI and informing the design of future human-AI collaborative systems.