Learning to Select Like Humans: Explainable Active Learning for Medical Imaging

📅 2026-02-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work proposes an explainability-guided active learning framework for medical image analysis, addressing a limitation of conventional approaches: they select samples solely on prediction uncertainty and ignore whether the model attends to clinically relevant regions. The method pairs classification uncertainty with a second selection criterion based on Grad-CAM attention maps and their Dice similarity to expert-annotated regions, steering the model toward diagnostically meaningful features. Evaluated on the BraTS, VinDr-CXR, and SIIM-COVID-19 datasets, the framework achieves 77.22%, 52.37%, and 52.66% accuracy, respectively, using only 570 labeled samples, substantially outperforming random sampling. Visual analyses further confirm close alignment between the model's attention and clinician-annotated regions of interest, improving both reliability and interpretability.
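
To make the dual-criterion selection concrete, here is a minimal sketch of how such an acquisition score could be computed. The paper's reference code is not reproduced here, so the function names, the entropy-based uncertainty term, and the trade-off weight alpha below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> float:
    """Uncertainty of one prediction as the entropy of its class probabilities."""
    eps = 1e-12  # guard against log(0)
    return float(-np.sum(probs * np.log(probs + eps)))

def dice_similarity(attention_mask: np.ndarray, roi_mask: np.ndarray) -> float:
    """Dice overlap between a binary attention mask and an expert ROI mask."""
    intersection = np.logical_and(attention_mask, roi_mask).sum()
    denom = attention_mask.sum() + roi_mask.sum()
    return 2.0 * float(intersection) / float(denom) if denom > 0 else 1.0

def acquisition_score(probs: np.ndarray,
                      attention_mask: np.ndarray,
                      roi_mask: np.ndarray,
                      alpha: float = 0.5) -> float:
    """Dual-criterion score: prefer samples that are both uncertain and misaligned.

    alpha is a hypothetical weighting; the paper's exact combination rule
    may differ.
    """
    uncertainty = predictive_entropy(probs)
    misalignment = 1.0 - dice_similarity(attention_mask, roi_mask)
    return alpha * uncertainty + (1.0 - alpha) * misalignment
```

In an active learning loop, one would score every unlabeled image this way and send the top-scoring samples to the annotator, so acquisition favors images the model is unsure about and is attending to for the wrong reasons.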

📝 Abstract
Medical image analysis requires substantial labeled data for model training, yet expert annotation is expensive and time-consuming. Active learning (AL) addresses this challenge by strategically selecting the most informative samples for annotation, but traditional methods rely solely on predictive uncertainty while ignoring whether models learn from clinically meaningful features, a critical requirement for clinical deployment. We propose an explainability-guided active learning framework that integrates spatial attention alignment into the sample acquisition process. Our approach uses a dual-criterion selection strategy combining: (i) classification uncertainty to identify informative examples, and (ii) attention misalignment with radiologist-defined regions-of-interest (ROIs) to target samples where the model focuses on incorrect features. By measuring misalignment between Grad-CAM attention maps and expert annotations using Dice similarity, our acquisition function identifies samples that enhance both predictive performance and spatial interpretability. We evaluate the framework on three expert-annotated medical imaging datasets: BraTS (MRI brain tumors), VinDr-CXR (chest X-rays), and SIIM-COVID-19 (chest X-rays). Using only 570 strategically selected samples, our explainability-guided approach consistently outperforms random sampling across all datasets, achieving 77.22% accuracy on BraTS, 52.37% on VinDr-CXR, and 52.66% on SIIM-COVID-19. Grad-CAM visualizations confirm that models trained with our dual-criterion selection focus on diagnostically relevant regions, demonstrating that incorporating explanation guidance into sample acquisition yields superior data efficiency while maintaining clinical interpretability.
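
The Dice comparison described in the abstract requires a binary attention mask, while Grad-CAM produces a continuous heatmap. The sketch below shows one plausible way to bridge that gap; the min-max normalization and the 0.5 cut-off are assumed defaults, not values taken from the paper.

```python
import numpy as np

def binarize_gradcam(heatmap: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Min-max normalize a raw Grad-CAM heatmap and threshold it into a mask.

    heatmap: 2-D array of Grad-CAM activations for one image.
    threshold: cut-off on the normalized map; 0.5 is an assumed default.
    """
    h_min, h_max = float(heatmap.min()), float(heatmap.max())
    if h_max - h_min < 1e-12:  # flat map: no salient region to compare
        return np.zeros(heatmap.shape, dtype=bool)
    normalized = (heatmap - h_min) / (h_max - h_min)
    return normalized >= threshold
```

The resulting boolean mask can then be compared against the expert ROI mask with a Dice overlap, as in the acquisition sketch above.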
Problem

Research questions and friction points this paper is trying to address.

active learning
medical imaging
explainability
sample selection
clinical interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Explainable Active Learning
Attention Alignment
Grad-CAM
Region-of-Interest (ROI)
Medical Image Analysis
Ifrat Ikhtear Uddin
AI Research, Department of Computer Science, University of South Dakota, Vermillion, SD 57069, USA
Longwei Wang
AI Research, Department of Computer Science, University of South Dakota, Vermillion, SD 57069, USA
Xiao Qin
Professor of Computer Science, Auburn University
Parallel and distributed systems, real-time computing, storage systems, fault tolerance
Yang Zhou
Auburn University
ML, AI, NLP, Security & Privacy, Systems
KC Santosh
AI Research, Department of Computer Science, University of South Dakota, Vermillion, SD 57069, USA