GLEAM: A Multimodal Imaging Dataset and HAMM for Glaucoma Classification

2026-03-13
AI Summary
This study addresses precise glaucoma staging, which is hindered by the scarcity of multimodal imaging data and the difficulty of cross-modal fusion. It introduces GLEAM, the first publicly available tri-modal glaucoma dataset, comprising scanning laser ophthalmoscopy fundus images, circumpapillary optical coherence tomography (OCT) images, and visual field pattern deviation maps, annotated with four disease stages. It further proposes hierarchical attentive masked modeling (HAMM), a framework that pairs hierarchical attentive encoders with light decoders so that cross-modal representation learning is concentrated in the encoder. By capturing complementary information across modalities, HAMM targets four-stage glaucoma classification, providing both a data foundation and an algorithmic approach to support clinical diagnosis.

Abstract
We propose glaucoma lesion evaluation and analysis with multimodal imaging (GLEAM), the first publicly available tri-modal glaucoma dataset. It comprises scanning laser ophthalmoscopy fundus images, circumpapillary OCT images, and visual field pattern deviation maps, annotated with four disease stages. The dataset enables effective exploitation of complementary multimodal information and facilitates accurate diagnosis and treatment across disease stages. To integrate cross-modal information, we propose hierarchical attentive masked modeling (HAMM) for multimodal glaucoma classification. The framework employs hierarchical attentive encoders and light decoders, so that cross-modal representation learning is concentrated in the encoder.
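The masked-modeling idea behind the framework can be illustrated with a minimal sketch. The abstract does not specify the masking strategy, token sizes, or mask ratio, so everything below is an assumption: generic random token masking in the style of masked autoencoders, with hypothetical token counts (196, 196, 64), an embedding width of 64, and a mask ratio of 0.75. It shows only how visible tokens from the three modalities could be gathered into one encoder input, leaving reconstruction of the masked tokens to a light decoder; it is not the authors' implementation.

```python
import numpy as np


def random_mask(tokens, mask_ratio, rng):
    """Return the visible tokens and a boolean mask (True = masked).

    tokens has shape (num_tokens, dim). This is generic random masking,
    assumed here for illustration only.
    """
    num_tokens = tokens.shape[0]
    num_masked = int(round(num_tokens * mask_ratio))
    perm = rng.permutation(num_tokens)
    mask = np.zeros(num_tokens, dtype=bool)
    mask[perm[:num_masked]] = True
    return tokens[~mask], mask


rng = np.random.default_rng(0)

# Hypothetical token embeddings for the three GLEAM modalities.
# Token counts and the embedding width are assumptions, not from the paper.
modalities = {
    "fundus": rng.normal(size=(196, 64)),
    "oct": rng.normal(size=(196, 64)),
    "visual_field": rng.normal(size=(64, 64)),
}

visible = {}
masks = {}
for name, tokens in modalities.items():
    visible[name], masks[name] = random_mask(tokens, 0.75, rng)

# Only the visible tokens of all modalities are passed to the encoder.
# A light decoder would later reconstruct the masked tokens.
encoder_input = np.concatenate(list(visible.values()), axis=0)
```

Under these assumed sizes, the encoder would process 114 of the 456 tokens, which is why most of the representational capacity can sit in the encoder while the decoder stays small.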
Problem

Research questions and friction points this paper is trying to address.

glaucoma
multimodal imaging
disease staging
medical image classification
cross-modal integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal imaging
glaucoma classification
hierarchical attentive masked modeling
cross-modal representation learning
public dataset
Jiao Wang
College of the Information Science and Engineering, Northeastern University, Shenyang 110819, China
Chi Liu
Department of Ophthalmology, Shenyang Fourth People's Hospital, Shenyang 110000, China
Yiying Zhang
Department of Mathematics; Southern University of Science and Technology
Optimal (re)insurance, Catastrophe insurance, Risk sharing, Risk mitigation, Systemic risks
Hongchen Luo
College of the Information Science and Engineering, Northeastern University, Shenyang 110819, China
Zhifen Guo
College of the Information Science and Engineering, Northeastern University, Shenyang 110819, China
Ying Hu
Professor of Mathematics, Université Rennes
stochastic analysis, control and optimization, mathematical finance
Ke Xu
Department of Ophthalmology, Shenyang Fourth People's Hospital, Shenyang 110000, China
Jing Zhou
Department of Ophthalmology, Shenyang Fourth People's Hospital, Shenyang 110000, China
Hongyan Xu
Tianjin University
Text Generation, Recommender System, Graph Learning
Ruiting Zhou
College of the Information Science and Engineering, Northeastern University, Shenyang 110819, China
Man Tang
College of the Information Science and Engineering, Northeastern University, Shenyang 110819, China