Over-Relying on Reliance: Towards Realistic Evaluations of AI-Based Clinical Decision Support

📅 2025-04-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current evaluations of AI-based Clinical Decision Support (AI-CDS) systems over-rely on static metrics such as trust, acceptance, reliance, and isolated AI task performance, failing to capture the dynamic, context-sensitive nature of human-AI collaboration in real clinical practice and thus suffering from low ecological validity. Method: Drawing on interdisciplinary insights from Human-Computer Interaction (HCI) and medical AI, this position paper combines critical analysis of influential AI-CDS literature with reflection on the authors' own design and evaluation work to deconstruct the limitations of prevailing assessment paradigms. Contribution/Results: The paper argues for moving beyond what it terms the "trap" of human-AI collaboration (Trust, Reliance, Acceptance, and Performance on the AI's task) toward ecologically valid evaluations centered on actual clinical benefit and clinicians' adaptive usage strategies, foregrounding the emergent value that AI co-produces within authentic clinical workflows. This work provides both a conceptual foundation and actionable methodological pathways for developing next-generation AI-CDS evaluation standards.

📝 Abstract
As AI-based clinical decision support (AI-CDS) is introduced in more and more aspects of healthcare services, HCI research plays an increasingly important role in designing for complementarity between AI and clinicians. However, current evaluations of AI-CDS often fail to capture when AI is and is not useful to clinicians. This position paper reflects on our work and influential AI-CDS literature to advocate for moving beyond evaluation metrics like Trust, Reliance, Acceptance, and Performance on the AI's task (what we term the "trap" of human-AI collaboration). Although these metrics can be meaningful in some simple scenarios, we argue that optimizing for them ignores important ways that AI falls short of clinical benefit, as well as ways that clinicians successfully use AI. As the fields of HCI and AI in healthcare develop new ways to design and evaluate CDS tools, we call on the community to prioritize ecologically valid, domain-appropriate study setups that measure the emergent forms of value that AI can bring to healthcare professionals.
Problem

Research questions and friction points this paper is trying to address.

Evaluating when AI-CDS is and is not useful to clinicians in realistic clinical settings
Moving beyond simplistic metrics such as Trust, Reliance, Acceptance, and AI task Performance
Prioritizing ecologically valid, domain-appropriate study setups for AI-CDS
Innovation

Methods, ideas, or system contributions that make the work stand out.

Naming the "trap" (Trust, Reliance, Acceptance, Performance) of human-AI collaboration metrics
Advocating ecologically valid, domain-appropriate study setups
Reframing evaluation around the emergent clinical value AI brings to healthcare professionals