🤖 AI Summary
Current diabetic retinopathy (DR) grading methods over-rely on visual features, neglect pathological patterns that remain stable across centers, and struggle with borderline cases between adjacent grades. To address these limitations, we propose Hierarchical Anchor Prototype Modulation (HAPM), a pathology-aware prototype evolution framework driven by language-model knowledge. Our method constructs a variance spectrum-driven anchor prototype library, designs a hierarchical differential prompt gating mechanism that dynamically selects discriminative semantic prompts from large vision-language model (LVLM) and large language model (LLM) sources, and incorporates a Pathological Semantic Injector (PSI) and a Discriminative Prototype Enhancer (DPE) for two-stage prototype modulation. This is the first approach to enable cross-modal, pathology-informed adaptive prototype evolution. Evaluated on eight public multi-center datasets, it outperforms state-of-the-art methods in DR grading accuracy and generalization robustness.
📝 Abstract
Diabetic retinopathy (DR) grading plays a critical role in early clinical intervention and vision preservation. Recent work focuses predominantly on extracting visual lesion features through data processing and domain decoupling strategies. However, these methods generally overlook domain-invariant pathological patterns and underutilize the rich contextual knowledge of foundation models. Relying solely on visual information is insufficient for distinguishing subtle pathological variations. We therefore integrate fine-grained pathological descriptions to complement prototypes with additional context, thereby resolving ambiguities in borderline cases. Specifically, we propose a Hierarchical Anchor Prototype Modulation (HAPM) framework for DR grading. First, we introduce a variance spectrum-driven anchor prototype library that preserves domain-invariant pathological patterns. Second, we employ a hierarchical differential prompt gating mechanism that dynamically selects discriminative semantic prompts from both large vision-language model (LVLM) and large language model (LLM) sources to address semantic confusion between adjacent DR grades. Finally, we use a two-stage prototype modulation strategy that progressively integrates clinical knowledge into visual prototypes through a Pathological Semantic Injector (PSI) and a Discriminative Prototype Enhancer (DPE). Extensive experiments across eight public datasets demonstrate that our approach achieves pathology-guided prototype evolution while outperforming state-of-the-art methods. The code is available at https://github.com/zhcz328/HAPM.
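The abstract does not specify how gating or modulation are computed, so the following is only a rough illustration of the general idea: weighting two sources of prompt embeddings per grade, injecting the result into visual prototypes, and classifying by nearest prototype. Everything here is an assumption rather than the HAPM implementation: the function names `gate_prompts`, `modulate`, and `classify`, the softmax-over-cosine-similarity gate, the residual injection, and the mixing weight `alpha` are all hypothetical stand-ins. The actual PSI and DPE modules are in the linked repository.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def gate_prompts(prototypes, lvlm_prompts, llm_prompts):
    """Hypothetical gate: per grade, weight the two prompt embeddings by a
    softmax over their cosine similarity to that grade's visual prototype.

    All inputs have shape (num_grades, dim).
    Returns the fused semantics (num_grades, dim) and gate weights (num_grades, 2).
    """
    p = l2_normalize(prototypes)
    a = l2_normalize(lvlm_prompts)
    b = l2_normalize(llm_prompts)
    sims = np.stack([(p * a).sum(-1), (p * b).sum(-1)], axis=-1)
    sims = sims - sims.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(sims)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    fused = weights[:, :1] * a + weights[:, 1:] * b
    return fused, weights


def modulate(prototypes, semantics, alpha=0.5):
    """Assumed modulation step: residual injection of semantics, then renormalize.
    `alpha` is an illustrative mixing weight, not a value from the paper."""
    return l2_normalize(prototypes + alpha * semantics)


def classify(features, prototypes):
    """Assign each feature vector to the grade with the most similar prototype."""
    return (l2_normalize(features) @ l2_normalize(prototypes).T).argmax(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_grades, dim = 5, 16  # five DR grades; embedding size chosen arbitrarily
    visual = rng.normal(size=(num_grades, dim))
    fused, weights = gate_prompts(
        visual,
        rng.normal(size=(num_grades, dim)),
        rng.normal(size=(num_grades, dim)),
    )
    evolved = modulate(visual, fused)
    print(weights.round(3))
    print(classify(rng.normal(size=(3, dim)), evolved))
```

The random arrays stand in for real image and text embeddings; with actual encoders, the prototypes would be built from training features and the prompts from grade-specific pathological descriptions.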