GlaBoost: A Multimodal Structured Framework for Glaucoma Risk Stratification

📅 2025-08-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Early glaucoma detection is hindered in clinical practice by overreliance on single-modality data and limited model interpretability. To address this, we propose GlaBoost, a multimodal gradient-boosting framework that integrates structured clinical metrics, fundus image embeddings, and expert-curated textual descriptions. The method uses a pretrained convolutional encoder for fundus photographs, a transformer-based language model for free-text neuroretinal rim assessments, and an enhanced XGBoost model for heterogeneous feature fusion and interpretable prediction. The most influential features, such as cup-to-disc ratio and rim pallor, align with established clinical knowledge, which supports decision transparency. On a real-world annotated dataset, the model reaches a validation accuracy of 98.71% and outperforms baseline models. This work offers a transparent, interpretable decision-support approach for glaucoma risk stratification that could be extended to other ophthalmic disorders.

📝 Abstract

Early and accurate detection of glaucoma is critical to prevent irreversible vision loss. However, existing methods often rely on unimodal data and lack interpretability, limiting their clinical utility. In this paper, we present GlaBoost, a multimodal gradient boosting framework that integrates structured clinical features, fundus image embeddings, and expert-curated textual descriptions for glaucoma risk prediction. GlaBoost extracts high-level visual representations from retinal fundus photographs using a pretrained convolutional encoder and encodes free-text neuroretinal rim assessments using a transformer-based language model. These heterogeneous signals, combined with manually assessed risk scores and quantitative ophthalmic indicators, are fused into a unified feature space for classification via an enhanced XGBoost model. Experiments conducted on a real-world annotated dataset demonstrate that GlaBoost significantly outperforms baseline models, achieving a validation accuracy of 98.71%. Feature importance analysis reveals clinically consistent patterns, with cup-to-disc ratio, rim pallor, and specific textual embeddings contributing most to model decisions. GlaBoost offers a transparent and scalable solution for interpretable glaucoma diagnosis and can be extended to other ophthalmic disorders.
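
The fusion step described in the abstract can be pictured as a concatenation of the three modalities into one named feature vector per patient. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fuse_features`, the `clin_`/`img_`/`txt_` prefixes, the embedding sizes, and all numbers are hypothetical, and the encoders that would produce the embeddings are not shown.

```python
from typing import Dict, List, Sequence, Tuple


def fuse_features(
    clinical: Dict[str, float],
    image_embedding: Sequence[float],
    text_embedding: Sequence[float],
) -> Tuple[List[str], List[float]]:
    """Concatenate three modalities into one named feature vector.

    Hypothetical helper: the prefixes only exist so that each column can
    later be traced back to the modality it came from.
    """
    names: List[str] = []
    values: List[float] = []
    # Structured clinical indicators keep their own names (sorted for a stable order).
    for key in sorted(clinical):
        names.append(f"clin_{key}")
        values.append(float(clinical[key]))
    # Image embedding dimensions are anonymous, so they are indexed.
    for i, v in enumerate(image_embedding):
        names.append(f"img_{i}")
        values.append(float(v))
    # Text embedding dimensions are indexed the same way.
    for i, v in enumerate(text_embedding):
        names.append(f"txt_{i}")
        values.append(float(v))
    return names, values


# Toy patient record; every number here is made up for illustration.
names, row = fuse_features(
    clinical={"cup_to_disc_ratio": 0.7, "risk_score": 3.0},
    image_embedding=[0.12, -0.40, 0.88],
    text_embedding=[0.05, 0.31],
)
print(names)
print(row)
```

Stacking one such row per patient gives a tabular matrix that a gradient-boosted tree model such as XGBoost can consume directly, which is presumably why a tree ensemble is a natural fit for heterogeneous inputs of this kind.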

Problem

Research questions and friction points this paper is trying to address.

Early, accurate glaucoma detection to prevent irreversible vision loss

Overcoming unimodal data limitations in existing methods

Integrating multimodal data for interpretable risk prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal fusion of structured clinical features, fundus image embeddings, and expert-curated text

Pretrained CNN and transformer for feature extraction

Enhanced XGBoost for interpretable risk prediction
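
One way to read a feature-importance analysis over fused inputs is to roll per-feature scores up to the modality they came from. The sketch below assumes, hypothetically, that each feature name carries a modality prefix (`clin_`, `img_`, `txt_`) and that raw importance scores are already available as a dictionary, for example from a trained booster's `get_score` call in XGBoost. The function name and all scores are illustrative and are not results from the paper.

```python
from collections import defaultdict
from typing import Dict


def importance_by_modality(importances: Dict[str, float]) -> Dict[str, float]:
    """Sum per-feature importance scores by name prefix and normalize to shares."""
    totals: Dict[str, float] = defaultdict(float)
    for name, score in importances.items():
        # The text before the first underscore is treated as the modality tag.
        modality = name.split("_", 1)[0]
        totals[modality] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid division by zero when no feature was ever used.
        return {modality: 0.0 for modality in totals}
    return {modality: score / grand_total for modality, score in totals.items()}


# Made-up scores, chosen only to show the arithmetic.
shares = importance_by_modality(
    {
        "clin_cup_to_disc_ratio": 4.0,
        "clin_rim_pallor": 2.0,
        "img_0": 1.0,
        "txt_3": 1.0,
    }
)
print(shares)  # {'clin': 0.75, 'img': 0.125, 'txt': 0.125}
```

Named clinical features remain individually inspectable, while embedding dimensions are only meaningful in aggregate, so a per-modality share is a reasonable summary for the image and text blocks.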

🔎 Similar Papers

GlaLSTM: A Concurrent LSTM Stream Framework for Glaucoma Detection via Biomarker Mining