PRETI: Patient-Aware Retinal Foundation Model via Metadata-Guided Representation Learning

📅 2025-05-18
📈 Citations: 0 (influential: 0)
🤖 AI Summary
To address the scarcity of clinical annotations and the poor generalizability of retinal image analysis models, this paper proposes PRETI, a patient-aware retinal foundation model. It introduces two key innovations: (1) Learnable Metadata Embedding (LME), which incorporates readily available clinical metadata, such as age and sex, into self-supervised representation learning; and (2) Retina-Aware Adaptive Masking (RAAM), which restricts masking to the retinal region and dynamically adjusts the masking ratio during training. The model is jointly optimized via metadata-guided contrastive learning and masked modeling. Evaluated on ophthalmic disease diagnosis and biomarker prediction tasks, it achieves state-of-the-art performance with markedly improved cross-center generalizability. The source code and pretrained model are publicly released.
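The LME idea described above can be sketched as follows: discrete metadata (an age bin and sex) is looked up in trainable embedding tables and prepended to the image patch tokens, so the encoder can condition on patient information. This is a minimal illustrative sketch, not the paper's actual code; all names, grid sizes, and dimensions (`DIM`, `N_AGE_BINS`, the 14x14 ViT patch grid) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64            # token embedding dimension (assumed)
N_AGE_BINS = 10     # decade bins: 0-9, 10-19, ... (assumed)
N_SEX = 2

# In a real model these tables would be trainable parameters updated by
# backprop; plain arrays here only show the lookup mechanics.
age_table = rng.normal(scale=0.02, size=(N_AGE_BINS, DIM))
sex_table = rng.normal(scale=0.02, size=(N_SEX, DIM))

def metadata_tokens(age_years: float, sex: int) -> np.ndarray:
    """Map raw metadata to two embedding tokens of shape (2, DIM)."""
    age_bin = min(int(age_years // 10), N_AGE_BINS - 1)
    return np.stack([age_table[age_bin], sex_table[sex]])

def prepend_metadata(image_tokens: np.ndarray, age: float, sex: int) -> np.ndarray:
    """Concatenate metadata tokens before the image patch tokens."""
    return np.concatenate([metadata_tokens(age, sex), image_tokens], axis=0)

patches = rng.normal(size=(196, DIM))   # 14x14 patch tokens of one fundus image
tokens = prepend_metadata(patches, age=63, sex=1)
print(tokens.shape)  # (198, 64)
```

Because the embedding tables are parameters of the network rather than fixed one-hot codes, the model can refine how age and sex influence the representation during pretraining, which is what makes the embedding "learnable".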

📝 Abstract
Retinal foundation models have significantly advanced retinal image analysis by leveraging self-supervised learning to reduce dependence on labeled data while achieving strong generalization. Many recent approaches enhance retinal image understanding using report supervision, but obtaining clinical reports is often costly and challenging. In contrast, metadata (e.g., age, gender) is widely available and serves as a valuable resource for analyzing disease progression. To effectively incorporate patient-specific information, we propose PRETI, a retinal foundation model that integrates metadata-aware learning with robust self-supervised representation learning. We introduce Learnable Metadata Embedding (LME), which dynamically refines metadata representations. Additionally, we construct patient-level data pairs, associating images from the same individual to improve robustness against non-clinical variations. To further optimize retinal image representation, we propose Retina-Aware Adaptive Masking (RAAM), a strategy that selectively applies masking within the retinal region and dynamically adjusts the masking ratio during training. PRETI captures both global structures and fine-grained pathological details, resulting in superior diagnostic performance. Extensive experiments demonstrate that PRETI achieves state-of-the-art results across diverse diseases and biomarker predictions using in-house and public data, indicating the importance of metadata-guided foundation models in retinal disease analysis. Our code and pretrained model are available at https://github.com/MICV-yonsei/PRETI
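The RAAM strategy from the abstract, masking only within the retinal region and ramping the masking ratio during training, might look roughly like the sketch below. The circular field-of-view approximation, the 14x14 patch grid, and the linear 0.4 to 0.75 ratio schedule are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

GRID = 14  # 14x14 patch grid for a 224x224 image with 16px patches (assumed)

def retina_patch_mask(grid: int = GRID) -> np.ndarray:
    """Boolean grid marking patches inside an inscribed circular fundus region."""
    ys, xs = np.mgrid[0:grid, 0:grid]
    c = (grid - 1) / 2.0
    return (ys - c) ** 2 + (xs - c) ** 2 <= (grid / 2.0) ** 2

def mask_ratio(step: int, total_steps: int, lo: float = 0.4, hi: float = 0.75) -> float:
    """Linearly increase the masking ratio as training progresses (assumed schedule)."""
    return lo + (hi - lo) * min(step / total_steps, 1.0)

def sample_masked_patches(step: int, total_steps: int, rng) -> np.ndarray:
    """Pick patch indices to mask, drawn only from the retinal region."""
    valid = np.flatnonzero(retina_patch_mask().ravel())
    n_mask = int(round(mask_ratio(step, total_steps) * valid.size))
    return rng.choice(valid, size=n_mask, replace=False)

rng = np.random.default_rng(0)
masked = sample_masked_patches(step=500, total_steps=1000, rng=rng)
```

Restricting the candidate set to retinal patches keeps the masked-modeling objective from wasting capacity on the uninformative black background, while the growing ratio makes the reconstruction task progressively harder.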
Problem

Research questions and friction points this paper is trying to address.

Clinical reports used for supervision are costly and hard to obtain at scale
Patient metadata (e.g., age, gender) is widely available but rarely exploited for disease analysis
Uniform random masking ignores retinal anatomy, limiting the quality of learned representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses metadata-aware learning for retinal analysis
Introduces Learnable Metadata Embedding (LME)
Proposes Retina-Aware Adaptive Masking (RAAM)