🤖 AI Summary
To address the challenge of limited fundus image data hindering the pretraining of ophthalmic AI models, this paper proposes a hierarchical feature-aware generative framework. The method integrates a feature pyramid encoder with a modified StyleGAN architecture to jointly preserve anatomical structure fidelity and capture pathological detail. Dilated convolutions and adaptive upsampling are incorporated to enhance multi-scale feature representation. Extensive validation is conducted on multi-center datasets: DDR, DRIVE, and IDRiD. On DDR, the generated images achieve SSIM = 0.8863 and FID = 54.2. When used for few-shot training, the synthetic data boost ResNet50's retinal disease diagnosis accuracy by 6.49%. This work establishes a scalable generative solution for efficient development of ophthalmic AI models under low-data regimes.
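The dilated convolutions mentioned above enlarge a filter's receptive field without adding parameters, which is how the generator can attend to both large anatomical structures and fine pathological detail. As a minimal NumPy sketch (hypothetical, not the paper's actual implementation), a dilated kernel samples the input on a spread-out grid:

```python
import numpy as np

def dilated_conv2d(img, kernel, dilation=1):
    """Valid 2-D convolution with a dilated kernel (no padding, stride 1)."""
    kh, kw = kernel.shape
    # Effective kernel extent grows with dilation; parameter count stays fixed.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    H, W = img.shape
    out = np.zeros((H - eff_h + 1, W - eff_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input with gaps of size `dilation` between taps.
            patch = img[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

img = np.arange(64.0).reshape(8, 8)
k = np.ones((3, 3)) / 9.0                     # 3x3 averaging kernel
plain = dilated_conv2d(img, k, dilation=1)    # 3x3 field  -> (6, 6) output
dilated = dilated_conv2d(img, k, dilation=2)  # 5x5 field  -> (4, 4) output
print(plain.shape, dilated.shape)             # → (6, 6) (4, 4)
```

Stacking such layers with increasing dilation lets the same 3x3 kernels cover retina-scale context (optic disc, vessel arcades) while still resolving lesion-scale features.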
📄 Abstract
Recent advancements in ophthalmology foundation models such as RetFound have demonstrated remarkable diagnostic capabilities but require massive datasets for effective pretraining, creating significant barriers to development and deployment. To address this critical challenge, we propose FundusGAN, a novel hierarchical feature-aware generative framework specifically designed for high-fidelity fundus image synthesis. Our approach leverages a Feature Pyramid Network within its encoder to comprehensively extract multi-scale information, capturing both large anatomical structures and subtle pathological features. The framework incorporates a modified StyleGAN-based generator with dilated convolutions and strategic upsampling adjustments to preserve critical retinal structures while enhancing pathological detail representation. Comprehensive evaluations on the DDR, DRIVE, and IDRiD datasets demonstrate that FundusGAN consistently outperforms state-of-the-art methods across multiple metrics (SSIM: 0.8863, FID: 54.2, KID: 0.0436 on DDR). Furthermore, disease classification experiments reveal that augmenting training data with FundusGAN-generated images significantly improves diagnostic accuracy across multiple CNN architectures (up to a 6.49% improvement with ResNet50). These results establish FundusGAN as a valuable foundation model component that effectively addresses data scarcity challenges in ophthalmological AI research, enabling more robust and generalizable diagnostic systems while reducing dependency on large-scale clinical data collection.
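Of the reported metrics, SSIM measures how well a generated image preserves the luminance, contrast, and structure of a reference. The sketch below is a simplified single-window version computed from global image statistics; the standard metric (and presumably what the paper reports) averages the same formula over local sliding windows:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """SSIM computed over whole images as one window (simplified variant).

    The canonical SSIM averages this expression over local Gaussian windows;
    the constants c1, c2 stabilize the ratio for near-zero statistics.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

rng = np.random.default_rng(0)
a = rng.random((32, 32))
print(round(global_ssim(a, a), 4))    # identical images score 1.0
print(global_ssim(a, 1.0 - a) < 1.0)  # an inverted image scores strictly lower
```

FID and KID, by contrast, compare distributions of deep Inception features between real and generated image sets rather than aligned image pairs, which is why they complement a pairwise metric like SSIM.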