Beware of"Explanations"of AI

📅 2025-04-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the misconception in eXplainable Artificial Intelligence (XAI) that "explanation equals safety," systematically exposing the risks that arise from poor explanation quality, including eroded trust, erroneous decision-making, privacy violations, and covert manipulation. Methodologically, it introduces a critical framework for explanation quality anchored in "goal–stakeholder–context," moving beyond techno-centric paradigms to foreground psychological alignment within socio-technical realities. Drawing on cognitive psychology (mental models theory), human factors engineering, AI ethics, and empirical case studies, the study identifies four distinct harm patterns: decision misguidance, privacy infringement, implicit manipulation, and trust backlash. The findings provide a cross-disciplinary theoretical foundation and actionable warnings for developing robust XAI evaluation standards, refining regulatory guidelines, and enabling responsible AI deployment.

📝 Abstract
Understanding the decisions made and actions taken by increasingly complex AI systems remains a key challenge. This has led to an expanding field of research in explainable artificial intelligence (XAI), highlighting the potential of explanations to enhance trust, support adoption, and meet regulatory standards. However, the question of what constitutes a "good" explanation is dependent on the goals, stakeholders, and context. At a high level, psychological insights such as the concept of mental model alignment can offer guidance, but success in practice is challenging due to social and technical factors. As a result of this ill-defined nature of the problem, explanations can be of poor quality (e.g. unfaithful, irrelevant, or incoherent), potentially leading to substantial risks. Instead of fostering trust and safety, poorly designed explanations can actually cause harm, including wrong decisions, privacy violations, manipulation, and even reduced AI adoption. Therefore, we caution stakeholders to beware of explanations of AI: while they can be vital, they are not automatically a remedy for transparency or responsible AI adoption, and their misuse or limitations can exacerbate harm. Attention to these caveats can help guide future research to improve the quality and impact of AI explanations.
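As a concrete illustration of what "unfaithful" can mean here, the sketch below (ours, not from the paper) probes a LIME-style post-hoc explanation: a linear surrogate is fit to a black-box classifier's outputs around one instance, and the surrogate's local R² serves as a rough fidelity score. A low score signals that the simple "explanation" does not actually track the model's behavior. Names such as local_fidelity are illustrative assumptions, not the authors' method.

```python
# Minimal sketch (not from the paper): checking local fidelity of a
# LIME-style linear surrogate around a single instance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

# A black-box model we want to "explain".
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def local_fidelity(model, x, scale=0.3, n=200, seed=0):
    """R^2 of a local linear surrogate vs. the black-box model around x.

    Low values indicate an unfaithful explanation: the linear story told
    about the model does not match what the model does nearby.
    """
    rng = np.random.default_rng(seed)
    # Perturb the instance to sample a local neighborhood.
    neighborhood = x + rng.normal(0.0, scale, size=(n, x.shape[0]))
    black_box = model.predict_proba(neighborhood)[:, 1]  # behavior to explain
    surrogate = Ridge().fit(neighborhood, black_box)     # the "explanation"
    return r2_score(black_box, surrogate.predict(neighborhood))

x0 = X[0]
print(f"local fidelity at x0: {local_fidelity(model, x0):.2f}")
```

In practice one would compare such fidelity scores across instances and neighborhood scales before presenting surrogate coefficients to stakeholders as an explanation.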
Problem

Research questions and friction points this paper is trying to address.

Understanding complex AI decisions remains challenging
Defining 'good' explanations depends on goals and context
Poor explanations risk harm and reduced AI adoption
Innovation

Methods, ideas, or system contributions that make the work stand out.

Focus on explainable AI (XAI) for transparency
Align mental models to improve explanation quality
Address risks of poor AI explanations proactively
David Martens
University of Antwerp
Data mining, Explainable AI, Mining Behavioral Data, Data Science Ethics, Responsible AI
Galit Shmueli
Chair Professor, National Tsing Hua University
business analytics, statistical methodology, machine learning, behavioral big data
Theodoros Evgeniou
INSEAD
https://github.com/tevgeniou
Kevin Bauer
Goethe University Frankfurt, Department of Information Systems, Frankfurt, 60629, Germany
Christian Janiesch
TU Dortmund University
Enterprise Computing, Information Systems, Business Process Management, Artificial Intelligence
Stefan Feuerriegel
Professor, LMU Munich
AI in Management, Business Analytics, Computational Social Science, AI for Good, Causal ML
Sebastian Gabel
Erasmus University, Rotterdam School of Management, Rotterdam, 3062, Netherlands
Sofie Goethals
University of Antwerp
Responsible AI, Machine Learning, Fairness, Explainable AI, Computational Social Science
Travis Greene
Copenhagen Business School
Data Science Ethics, Personalization
Nadja Klein
Karlsruhe Institute of Technology
Statistical & Machine LearningBayesian Statistics & ComputingRegressionDeep Learning &
Mathias Kraus
Professor for Explainable AI, University of Regensburg
Machine Learning, Business Analytics, Interpretable AI, Explainable AI
Niklas Kühl
University of Bayreuth, Faculty of Law, Business and Economics, Bayreuth, 95440, Germany
Claudia Perlich
New York University, Department of Technology, Operations and Statistics, New York, NY 10012, USA
Wouter Verbeke
Professor of Decision Science, KU Leuven, Faculty of Economics and Business
Decision Science, Credit Risk, Data Science, Causal Machine Learning, Uplift Modeling
Alona Zharova
Humboldt-Universität zu Berlin, School of Business and Economics, Berlin, 10099, Germany
Patrick Zschech
Professor of Business Information Systems, esp. Intelligent Systems & Services, TU Dresden
Business Analytics, Decision Support Systems, Deep Learning, Interpretable Machine Learning
Foster Provost
Professor, New York University
machine learning, information systems, data mining, kdd, parallel processing