Minimum Data, Maximum Impact: 20 annotated samples for explainable lung nodule classification

📅 2025-08-01
🤖 AI Summary
Interpretable models for pulmonary nodule classification are held back by a severe scarcity of pathology-oriented visual attribute annotations (e.g., spiculation, lobulation, vacuole). To address this, the work proposes a few-shot attribute-conditional diffusion model for medical image synthesis. Trained on only 20 real attribute-annotated lung nodule samples from the LIDC-IDRI dataset, the diffusion model controls clinically relevant pathological attributes while producing high-fidelity, semantically consistent nodule images. The synthesized data augment the training of an interpretable classification model, preserving decision transparency while improving performance: attribute prediction accuracy increases by 13.4%, and benign/malignant classification accuracy improves by 1.8%, compared with training on the small real dataset alone. To the authors' knowledge, this is the first study to apply ultra-low-shot conditional diffusion modeling to generate radiologist-defined clinical visual attributes. The approach points toward a way of deploying few-shot interpretable AI in medical imaging.

📝 Abstract
Classification models that provide human-interpretable explanations increase clinicians' trust in AI-assisted medical image diagnosis and make it more usable. One research focus is to predict, alongside the diagnosis, the pathology-related visual attributes that radiologists use, aligning AI decision-making with clinical reasoning. Radiologists use attributes such as shape and texture as established diagnostic criteria, and mirroring these in AI decision-making both enhances transparency and enables explicit validation of model outputs. However, the adoption of such models is limited by the scarcity of large-scale medical image datasets annotated with these attributes. To address this challenge, we propose synthesizing attribute-annotated data using a generative model. We extend a diffusion model with attribute conditioning and train it using only 20 attribute-labeled lung nodule samples from the LIDC-IDRI dataset. Incorporating its generated images into the training of an explainable model boosts performance, increasing attribute prediction accuracy by 13.4% and target prediction accuracy by 1.8% compared to training with only the small real attribute-annotated dataset. This work highlights the potential of synthetic data to overcome dataset limitations, enhancing the applicability of explainable models in medical image analysis.
Problem

Research questions and friction points this paper is trying to address.

Lack of large medical datasets with annotated visual attributes
Need for explainable AI models aligning with clinical reasoning
Improving accuracy with minimal real annotated lung nodule samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attribute-conditioned Diffusion Model for data synthesis
Generator trained on only 20 attribute-annotated real lung nodule samples
Synthetic data boosts attribute prediction accuracy by 13.4% and target prediction accuracy by 1.8%
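
As a rough illustration of what attribute conditioning in a diffusion model involves, the sketch below builds one training example for a conditional noise-prediction network: it encodes attribute ratings as a conditioning vector and applies the standard DDPM forward noising step. It is a minimal sketch under stated assumptions, not the authors' implementation. The attribute names, the 1 to 5 rating scale, the linear noise schedule, the patch size, and all function names are assumptions chosen for illustration; the page does not describe the paper's architecture or encoding.

```python
import numpy as np

# Hypothetical attribute set and rating scale, assumed for illustration only.
ATTRIBUTES = ("spiculation", "lobulation", "texture")
MAX_RATING = 5  # assumed ordinal scale 1..5


def encode_attributes(ratings):
    """Map a dict of ordinal ratings to a conditioning vector in [0, 1]."""
    return np.array(
        [(ratings[name] - 1) / (MAX_RATING - 1) for name in ATTRIBUTES],
        dtype=np.float32,
    )


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule as commonly used for DDPM (an assumption here)."""
    return np.linspace(beta_start, beta_end, num_steps, dtype=np.float64)


def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


def make_training_example(x0, ratings, alpha_bar, rng):
    """Build one (inputs, target) pair for a conditional noise predictor.

    A network eps_theta(x_t, t, cond) would be trained to regress `target`
    (the sampled noise) from `inputs`; the network itself is omitted.
    """
    t = int(rng.integers(0, len(alpha_bar)))
    x_t, eps = q_sample(x0, t, alpha_bar, rng)
    inputs = {"x_t": x_t, "t": t, "cond": encode_attributes(ratings)}
    return inputs, eps


rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - linear_beta_schedule(1000))
nodule_patch = rng.standard_normal((32, 32))  # stand-in for a normalized CT patch
inputs, target = make_training_example(
    nodule_patch, {"spiculation": 4, "lobulation": 2, "texture": 5}, alpha_bar, rng
)
```

At sampling time, the same conditioning vector would be passed to the trained network at every denoising step, so the requested attribute ratings double as the labels of the synthetic image.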
Luisa Gallée
Experimental Radiology, Ulm University Medical Center, Ulm, Germany
Catharina Silvia Lisson
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
Christoph Gerhard Lisson
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
Daniela Drees
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
Felix Weig
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
Daniel Vogele
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
Meinrad Beer
Department of Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany; XAIRAD - Cooperation for Artificial Intelligence in Experimental Radiology, Ulm, Germany
Michael Götz
Junior Professor, Section Experimental Radiology, University Hospital Ulm
Machine Learning, Personalized Medicine, Radiomics, Transfer Learning, Medical Image Analysis