FunnyNodules: A Customizable Medical Dataset Tailored for Evaluating Explainable AI

📅 2025-11-19

🤖 AI Summary
Existing medical imaging datasets lack dense, diagnosis-reasoning–oriented attribute annotations, severely hindering the development and evaluation of eXplainable AI (xAI) models. Method: We propose the first parameterized synthetic medical image dataset framework designed specifically for xAI, using controllably generated abstract lung nodule–like shapes to precisely model visual attributes—including shape, edge sharpness, and spiculation—and their explicit, rule-based mappings to diagnostic labels. Contribution/Results: The framework enables full-dimensional, decoupled control over diagnostic logic, attribute composition, and data complexity, supporting model-agnostic, attribute-level attribution evaluation and attention-alignment analysis. Experiments demonstrate its capability to accurately determine whether models base decisions on semantically correct features. It provides a reproducible, scalable benchmark for evaluating diverse xAI methods—enabling rigorous, fine-grained assessment of interpretability mechanisms in medical AI.

📝 Abstract
Densely annotated medical image datasets that capture not only diagnostic labels but also the underlying reasoning behind these diagnoses are scarce. Such reasoning-related annotations are essential for developing and evaluating explainable AI (xAI) models that reason similarly to radiologists: making correct predictions for the right reasons. To address this gap, we introduce FunnyNodules, a fully parameterized synthetic dataset designed for systematic analysis of attribute-based reasoning in medical AI models. The dataset generates abstract, lung nodule-like shapes with controllable visual attributes such as roundness, margin sharpness, and spiculation. The target class is derived from a predefined attribute combination, allowing full control over the decision rule that links attributes to the diagnostic class. We demonstrate how FunnyNodules can be used in model-agnostic evaluations to assess whether models learn correct attribute-target relations, to interpret over- or underperformance in attribute prediction, and to analyze attention alignment with attribute-specific regions of interest. The framework is fully customizable, supporting variations in dataset complexity, target definitions, class balance, and more. With complete ground truth information, FunnyNodules provides a versatile foundation for developing, benchmarking, and conducting in-depth analyses of explainable AI methods in medical image analysis.
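The core idea — sampling controllable visual attributes and deriving the class label from a fully known decision rule — can be sketched as follows. Note that the attribute ranges and the default rule below are illustrative assumptions, not the paper's actual parameterization:

```python
import random

random.seed(0)

def sample_nodule_attributes():
    """Sample continuous visual attributes in [0, 1] (illustrative ranges)."""
    return {
        "roundness": random.random(),         # 1.0 = perfectly round
        "margin_sharpness": random.random(),  # 1.0 = crisp edge
        "spiculation": random.random(),       # 1.0 = strongly spiculated
    }

def diagnostic_label(attrs, rule=None):
    """Derive the target class from a predefined, fully known rule.

    The default rule (spiculated AND not round -> 'malignant-like') is a
    hypothetical placeholder; the framework lets users plug in their own
    attribute-to-label mapping.
    """
    if rule is None:
        rule = lambda a: a["spiculation"] > 0.5 and a["roundness"] < 0.5
    return "malignant-like" if rule(attrs) else "benign-like"

attrs = sample_nodule_attributes()
print(attrs, "->", diagnostic_label(attrs))
```

Because the rule is an explicit, user-supplied function, the ground-truth reasoning behind every label is known exactly — which is what makes attribute-level xAI evaluation possible.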
Problem

Research questions and friction points this paper is trying to address.

Addressing scarcity of medical datasets with diagnostic reasoning annotations
Evaluating explainable AI models' attribute-based reasoning capabilities
Providing customizable synthetic data for systematic xAI method analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic dataset with controllable visual attributes
Model-agnostic evaluation of attribute-target relations
Customizable framework supporting dataset complexity variations
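One way the model-agnostic evaluation of attribute-target relations could look in practice: apply the known decision rule to a model's *predicted* attributes and check whether the result agrees with the model's class prediction. High consistency suggests the model decides "for the right reasons." This is a hypothetical sketch; function names and the rule are assumptions, not the paper's API:

```python
def rule(attrs):
    # Illustrative decision rule (assumption, not the paper's actual rule)
    return attrs["spiculation"] > 0.5 and attrs["roundness"] < 0.5

def reasoning_consistency(predictions):
    """Fraction of samples where the model's class prediction matches the
    known ground-truth rule applied to its own predicted attributes."""
    hits = sum(1 for p in predictions if rule(p["attrs"]) == p["is_positive"])
    return hits / len(predictions)

# Toy predictions from a hypothetical model: attributes + binary class output.
preds = [
    {"attrs": {"spiculation": 0.9, "roundness": 0.2}, "is_positive": True},
    {"attrs": {"spiculation": 0.1, "roundness": 0.8}, "is_positive": True},
]
print(reasoning_consistency(preds))  # -> 0.5: second prediction contradicts the rule
```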
Luisa Gallée
Experimental Radiology, Ulm University Medical Center, Ulm, Germany
Yiheng Xiong
Experimental Radiology, Ulm University Medical Center, Ulm, Germany
Meinrad Beer
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
Michael Götz
Junior Professor, Section Experimental Radiology, University Hospital Ulm
Machine Learning · Personalized Medicine · Radiomics · Transfer Learning · Medical Image Analysis