Automated Glaucoma Report Generation via Dual-Attention Semantic Parallel-LSTM and Multimodal Clinical Data Integration

📅 2025-10-11
🤖 AI Summary
Automated generation of glaucoma diagnostic reports suffers from two key limitations: (1) redundant content and (2) insufficient representation of critical pathological features, such as optic disc cupping, retinal nerve fiber layer (RNFL) defects, and visual field abnormalities. To address these, we propose the Dual-Attention Semantic Parallel-LSTM (DA-SPL) network. Within an encoder-decoder framework, DA-SPL combines a cross-modal dual-attention mechanism, a parallel LSTM decoder, and a label-enhancement module to fuse fundus images with heterogeneous clinical data. Experiments on standard glaucoma datasets show that DA-SPL consistently outperforms state-of-the-art methods on quantitative metrics. It captures subtle glaucomatous indicators and generates reports whose terminology agrees closely with clinical expert annotations.

📝 Abstract
Generative AI for automated glaucoma diagnostic report generation faces two predominant challenges: content redundancy in narrative outputs and inadequate highlighting of pathologically significant features including optic disc cupping, retinal nerve fiber layer defects, and visual field abnormalities. These limitations primarily stem from current multimodal architectures' insufficient capacity to extract discriminative structural-textural patterns from fundus imaging data while maintaining precise semantic alignment with domain-specific terminology in comprehensive clinical reports. To overcome these constraints, we present the Dual-Attention Semantic Parallel-LSTM Network (DA-SPL), an advanced multimodal generation framework that synergistically processes both fundus imaging and supplementary visual inputs. DA-SPL employs an Encoder-Decoder structure augmented with a novel joint dual-attention mechanism in the encoder for cross-modal feature refinement, a parallelized LSTM decoder architecture for enhanced temporal-semantic consistency, and a specialized label enhancement module for accurate disease-relevant term generation. Rigorous evaluation on standard glaucoma datasets demonstrates DA-SPL's consistent superiority over state-of-the-art models across quantitative metrics. DA-SPL exhibits exceptional capability in extracting subtle pathological indicators from multimodal inputs while generating diagnostically precise reports that show strong concordance with clinical expert annotations.
Problem

Research questions and friction points this paper is trying to address.

Addresses content redundancy in automated glaucoma diagnostic reports
Improves highlighting of pathological features in medical imaging
Enhances semantic alignment between clinical data and generated reports
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-attention mechanism for cross-modal feature refinement
Parallel LSTM decoder for temporal-semantic consistency
Label enhancement module for disease-relevant term generation
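To make the first item concrete, here is a minimal sketch of generic scaled dot-product cross-attention, in which a feature vector from one modality attends over region features from another. This page does not specify the form of DA-SPL's dual-attention mechanism, so the formulation, the function names (`softmax`, `cross_attention`), and the toy feature values are all illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, keys, values):
    """Scaled dot-product attention of one query vector over key/value vectors.

    Assumption: a generic attention form, not DA-SPL's actual mechanism.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Context is the attention-weighted sum of the value vectors.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights

# Hypothetical toy features: three image-region vectors and one query vector
# standing in for the second modality.
image_regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
second_modality_query = [2.0, 0.0]

context, weights = cross_attention(second_modality_query, image_regions, image_regions)
```

In this toy case the first and third regions align equally well with the query and receive equal, larger weights than the second. A dual variant could plausibly apply the same operation in both directions between modalities, but that reading is an assumption as well.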
Cheng Huang
Southern Methodist University
Weizheng Xie
MSCS, Southern Methodist University
Medical Image, AI for Finance
Zeyu Han
Southern Methodist University
Tsengdar Lee
National Aeronautics and Space Administration
Karanjit Kooner
Associate Professor of Ophthalmology, University of Texas Southwestern Medical Center, Dallas.
Glaucoma
Jui-Ka Wang
University of Texas Southwestern Medical Center
Ning Zhang
Northeastern University
Jia Zhang
Southern Methodist University