Effect of Data Augmentation on Conformal Prediction for Diabetic Retinopathy

📅 2025-08-19
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This study investigates how data augmentation strategies affect the performance of conformal prediction in diabetic retinopathy (DR) grading, specifically evaluating their impact on key uncertainty quantification metrics—empirical coverage, prediction set size, and efficiency. Using the DDR dataset and backbone models ResNet-50 and CoaT, we systematically assess five augmentation strategies: no augmentation, geometric transformations, CLAHE, Mixup, and CutMix. Results show that Mixup and CutMix significantly improve both coverage reliability and predictive efficiency, whereas CLAHE may impair confidence calibration. Notably, this work is the first to demonstrate that mixed augmentations not only enhance classification accuracy but also jointly optimize the statistical validity of conformal prediction. These findings provide empirical evidence and methodological guidance for the co-design of augmentation techniques and uncertainty quantification in trustworthy medical AI systems.
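The three uncertainty metrics named in the summary can be made concrete with a short sketch. This is a hypothetical illustration, not the paper's code; in particular, "correct efficiency" is read here as the fraction of prediction sets that are correct singletons, which is one common usage of the term.

```python
import numpy as np

def conformal_metrics(pred_sets, labels):
    """Compute empirical coverage, average prediction-set size, and
    correct efficiency (share of sets that are correct singletons --
    an assumed reading of the paper's term).

    pred_sets: (n, k) boolean array, True where a class is in the set.
    labels:    (n,) integer array of true class indices.
    """
    n = len(labels)
    hits = pred_sets[np.arange(n), labels]        # true label inside set?
    coverage = hits.mean()                        # empirical coverage
    avg_size = pred_sets.sum(axis=1).mean()       # average set size
    singleton = pred_sets.sum(axis=1) == 1
    correct_eff = (singleton & hits).mean()       # correct singletons
    return coverage, avg_size, correct_eff

# Toy example: 4 samples, 3 DR grades.
sets = np.array([[True,  False, False],
                 [True,  True,  False],
                 [False, True,  False],
                 [True,  True,  True]])
labels = np.array([0, 1, 0, 2])
cov, size, eff = conformal_metrics(sets, labels)
# cov = 0.75, size = 1.75, eff = 0.25
```

Smaller sets at the same coverage mean a more efficient (more clinically useful) predictor, which is the trade-off the study tracks across augmentation regimes.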

📝 Abstract
The clinical deployment of deep learning models for high-stakes tasks such as diabetic retinopathy (DR) grading requires demonstrable reliability. While models achieve high accuracy, their clinical utility is limited by a lack of robust uncertainty quantification. Conformal prediction (CP) offers a distribution-free framework to generate prediction sets with statistical guarantees of coverage. However, the interaction between standard training practices like data augmentation and the validity of these guarantees is not well understood. In this study, we systematically investigate how different data augmentation strategies affect the performance of conformal predictors for DR grading. Using the DDR dataset, we evaluate two backbone architectures -- ResNet-50 and a Co-Scale Conv-Attentional Transformer (CoaT) -- trained under five augmentation regimes: no augmentation, standard geometric transforms, CLAHE, Mixup, and CutMix. We analyze the downstream effects on conformal metrics, including empirical coverage, average prediction set size, and correct efficiency. Our results demonstrate that sample-mixing strategies like Mixup and CutMix not only improve predictive accuracy but also yield more reliable and efficient uncertainty estimates. Conversely, methods like CLAHE can negatively impact model certainty. These findings highlight the need to co-design augmentation strategies with downstream uncertainty quantification in mind to build genuinely trustworthy AI systems for medical imaging.
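As a rough sketch of the split conformal procedure the abstract describes: calibrate a score threshold on held-out data, then include every class whose score clears it. The 1 − p(true class) score and α = 0.1 are common defaults assumed here, not the paper's stated configuration, and Dirichlet draws stand in for a trained DR classifier's softmax outputs.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration with the simple nonconformity score
    1 - softmax probability of the true class (assumed score)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile gives >= 1 - alpha coverage.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(q_level, 1.0), method="higher")

def prediction_sets(test_probs, qhat):
    """A class enters the set when its score 1 - p <= qhat."""
    return test_probs >= (1.0 - qhat)

rng = np.random.default_rng(0)

# Hypothetical perfectly calibrated 3-class model: labels drawn
# from the predicted probabilities themselves.
cal_probs = rng.dirichlet(np.ones(3), size=200)
cal_labels = np.array([rng.choice(3, p=p) for p in cal_probs])
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)

test_probs = rng.dirichlet(np.ones(3), size=500)
test_labels = np.array([rng.choice(3, p=p) for p in test_probs])
sets = prediction_sets(test_probs, qhat)

coverage = sets[np.arange(500), test_labels].mean()  # should be near 0.9
avg_size = sets.sum(axis=1).mean()                   # efficiency metric
```

The coverage guarantee holds regardless of the model, which is why the study can vary augmentation freely and still compare predictors on equal statistical footing.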
Problem

Research questions and friction points this paper is trying to address.

How data augmentation affects conformal prediction reliability
Interaction between augmentation strategies and uncertainty guarantees
Impact of augmentation on diabetic retinopathy grading certainty
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data augmentation strategies for conformal prediction
Mixup and CutMix improve uncertainty reliability
CLAHE negatively impacts model certainty metrics
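The sample-mixing idea highlighted above can be sketched in a few lines. Random arrays stand in for fundus images, and the Beta parameter alpha = 0.4 is a typical choice from the Mixup literature, not necessarily the paper's setting.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Mixup: convex combination of two samples and their one-hot
    labels, with mixing weight lambda ~ Beta(alpha, alpha)."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

rng = np.random.default_rng(42)
# Two hypothetical fundus "images" with one-hot DR grade labels.
img_a, img_b = rng.random((32, 32)), rng.random((32, 32))
lab_a = np.array([1.0, 0.0, 0.0])  # grade 0
lab_b = np.array([0.0, 0.0, 1.0])  # grade 2
mixed_img, mixed_lab = mixup(img_a, lab_a, img_b, lab_b, rng=rng)
```

CutMix follows the same recipe but pastes a rectangular patch of one image into the other, weighting the labels by patch area; the soft labels both methods produce are a plausible reason for the calibration gains the paper reports.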
Authors

Rizwan Ahamed (West Virginia University, Morgantown, WV 26506, USA)
Annahita Amireskandari (West Virginia University, Morgantown, WV 26506, USA)
Joel Palko (West Virginia University, Ophthalmology)
Carol Laxson (West Virginia University, Morgantown, WV 26506, USA)
Binod Bhattarai (Assistant Professor, University of Aberdeen; Machine Learning, Medical Image Analysis, Computer Vision)
Prashnna Gyawali (West Virginia University, Morgantown, WV 26506, USA)