User Perception of Attention Visualizations: Effects on Interpretability Across Evidence-Based Medical Documents

📅 2025-08-05
📈 Citations: 0
✹ Influential: 0
📄 PDF
🀖 AI Summary
This study investigates whether attention weights can serve as credible explanations for evidence-based medicine literature classification and examines how different visualization modalities affect clinicians’ perceived interpretability. Method: We employed XLNet for classification, extracted inter-layer attention weights, and designed multiple visualization schemes—including text highlighting, background color mapping, and heatmap overlays—followed by a user study with clinical experts. Contribution/Results: Although the model achieved high classification accuracy, attention weights alone were not broadly perceived as effective explanations. Crucially, visualization modality significantly moderated clinicians’ trust and comprehension: intuitive encodings (e.g., brightness or background color) substantially outperformed abstract heatmaps. This challenges conventional design paradigms in explainable AI visualization. To our knowledge, this is the first empirical study in medical NLP demonstrating that the *form* of visualization—not the attention mechanism itself—is the primary determinant of human acceptance of AI decisions.

📝 Abstract
The attention mechanism is a core component of the Transformer architecture. Beyond improving performance, attention has been proposed as a mechanism for explainability via attention weights, which are associated with input features (e.g., tokens in a document). In this context, larger attention weights may imply more relevant features for the model's prediction. In evidence-based medicine, such explanations could support physicians' understanding of and interaction with AI systems used to categorize biomedical literature. However, there is still no consensus on whether attention weights provide helpful explanations. Moreover, little research has explored how visualizing attention affects its usefulness as an explanation aid. To bridge this gap, we conducted a user study to evaluate whether attention-based explanations support users in biomedical document classification and whether there is a preferred way to visualize them. The study involved medical experts from various disciplines who classified articles based on study design (e.g., systematic reviews, broad synthesis, randomized and non-randomized trials). Our findings show that the Transformer model (XLNet) classified documents accurately; the attention weights, however, were not perceived as particularly helpful for explaining the predictions. This perception nevertheless varied significantly depending on how attention was visualized. Contrary to Munzner's principle of visual effectiveness, which favors precise encodings like bar length, users preferred more intuitive formats, such as text brightness or background color. While our results do not confirm the overall utility of attention weights for explanation, they suggest that their perceived helpfulness is influenced by how they are visually presented.
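To make the "background color" encoding the abstract mentions concrete, here is a minimal, self-contained sketch of how token-level attention weights can be rendered as color-highlighted text. The tokens, weights, and color choice are illustrative assumptions, not the paper's actual data, model output, or implementation:

```python
# Toy sketch: rendering token-level attention weights as background-color
# highlights, one of the visualization styles the study compares.
# Tokens and weights below are made-up illustrative values.

def attention_to_html(tokens, weights):
    """Map each token's attention weight to a background-color <span>.

    Weights are min-max normalized so the most-attended token gets the
    strongest highlight; a yellow background with weight-scaled opacity
    approximates the text-highlighting scheme described in the paper.
    """
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0  # avoid division by zero for uniform weights
    html = []
    for tok, w in zip(tokens, weights):
        alpha = (w - lo) / span  # 0.0 (least attended) .. 1.0 (most attended)
        html.append(
            f'<span style="background-color: rgba(255, 215, 0, {alpha:.2f})">{tok}</span>'
        )
    return " ".join(html)


tokens = ["randomized", "controlled", "trial", "of", "aspirin"]
weights = [0.91, 0.55, 0.87, 0.05, 0.40]
print(attention_to_html(tokens, weights))
```

A heatmap overlay or brightness encoding would differ only in the color mapping step; the study's finding is that this presentational choice, rather than the weights themselves, drives perceived helpfulness.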
Problem

Research questions and friction points this paper is trying to address.

Evaluating if attention weights aid biomedical document classification
Assessing impact of visualization formats on explanation usefulness
Determining medical experts' preference for attention visualization methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer model XLNet for document classification
Attention weights as explainability mechanism
Visualization affects perceived explanation usefulness