Bridging Restoration and Diagnosis: A Comprehensive Benchmark for Retinal Fundus Enhancement

📅 2026-04-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of a clinically oriented, unified evaluation framework for fundus image enhancement: conventional metrics fail to adequately assess critical medical features such as lesion preservation and vascular structure. To fill this gap, the authors propose EyeBench-V2, a benchmark that enables fair comparison of both paired and unpaired enhancement methods within a consistent framework. EyeBench-V2 integrates a structured subjective evaluation protocol designed by ophthalmic experts and directly links enhancement quality to downstream clinical tasks, including lesion segmentation, vessel segmentation, and diabetic retinopathy grading. By combining downstream-task evaluation of generative models, expert assessments, and noise generalization tests, the benchmark systematically evaluates the efficacy and reliability of enhancement methods in realistic clinical settings. The work provides actionable performance insights for clinical research, exposes limitations of current approaches, and informs the development of next-generation enhancement models aligned with clinical needs.

📝 Abstract

Over the past decade, generative models have demonstrated success in enhancing fundus images. However, evaluating these models remains a challenge. A benchmark for fundus image enhancement is needed for three main reasons: (1) conventional denoising metrics such as PSNR and SSIM fail to capture clinically relevant features, such as lesion preservation and vessel morphology consistency, limiting their applicability in real-world settings; (2) there is a lack of unified evaluation protocols that address both paired and unpaired enhancement methods, particularly those guided by clinical expertise; and (3) an evaluation framework should provide actionable insights to guide future advancements in clinically aligned enhancement models. To address these gaps, we introduce EyeBench-V2, a benchmark designed to connect enhancement model performance with clinical utility. Our work offers three key contributions: (1) Multi-dimensional clinical alignment through downstream evaluations: beyond standard enhancement metrics, we assess performance across clinically meaningful tasks including vessel segmentation, diabetic retinopathy (DR) grading, generalization to unseen noise patterns, and lesion segmentation. (2) Expert-guided evaluation design: we curate a novel dataset enabling fair comparisons between paired and unpaired enhancement methods, accompanied by a structured manual assessment protocol carried out by medical experts, which evaluates clinically critical aspects such as lesion structure alterations, background color shifts, and the introduction of artificial structures. (3) Actionable insights: our benchmark provides a rigorous, task-oriented analysis of existing generative models, equipping clinical researchers with the evidence needed to make informed decisions, while also identifying limitations in current methods to inform the design of next-generation enhancement models.
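
The abstract's first point, that PSNR can miss clinically relevant structure, can be illustrated with a toy example. The sketch below is not EyeBench-V2's evaluation code: the synthetic image, the brightness-shift and vessel-erasing "enhancements", and the threshold-based `segment_vessels` helper are all hypothetical stand-ins chosen only to make the effect visible. It compares a pixel-level metric (PSNR) with a downstream-style metric (Dice overlap of a vessel mask).

```python
import numpy as np


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


def dice(mask_a, mask_b, eps=1e-8):
    """Dice overlap between two binary masks (1.0 means identical)."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    return (2.0 * intersection + eps) / (a.sum() + b.sum() + eps)


def segment_vessels(image, threshold=0.5):
    """Hypothetical stand-in for a vessel segmenter: dark pixels count as vessel."""
    return image < threshold


# Synthetic "clean" image: bright background with one thin dark vessel.
reference = np.full((64, 64), 0.8)
reference[32, :] = 0.2

# Output A (assumed): global brightness shift, vessel preserved.
output_a = reference + 0.1

# Output B (assumed): background untouched, but the vessel is painted over.
output_b = reference.copy()
output_b[32, :] = 0.8

ref_mask = segment_vessels(reference)
psnr_a, psnr_b = psnr(reference, output_a), psnr(reference, output_b)
dice_a = dice(ref_mask, segment_vessels(output_a))
dice_b = dice(ref_mask, segment_vessels(output_b))

print(f"A: PSNR={psnr_a:.2f} dB, vessel Dice={dice_a:.3f}")
print(f"B: PSNR={psnr_b:.2f} dB, vessel Dice={dice_b:.3f}")
```

In this toy setting the vessel-erasing output B scores the higher PSNR, because it changes few pixels, while its vessel Dice falls to roughly zero. Output A scores a lower PSNR yet keeps the vessel mask intact. A real evaluation would replace the threshold with a trained segmentation model and use actual fundus images.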

Problem

Research questions and friction points this paper is trying to address.

fundus image enhancement

clinical evaluation

generative models

benchmark

lesion preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

fundus image enhancement

clinical alignment

expert-guided evaluation