🤖 AI Summary
This study addresses the poor performance of existing models on long-tailed disease classification in chest X-rays. We propose a novel method that integrates radiologists' eye-tracking-guided temporal attention patterns, introducing, for the first time, dynamic, sequential visual search attention into long-tailed learning frameworks. Unlike static attention modeling, our approach features an integration-and-disintegration dual-path architecture: one path captures global contextual information, while the other models the temporal evolution of localized pathological regions. The method combines eye-tracking-driven attention guidance, deep neural networks, and a long-tailed loss function. Evaluated on the NIH-CXR-LT and MIMIC-CXR-LT benchmarks, it achieves average accuracy improvements of 4.1% over the best-performing long-tailed loss baseline and 21.7% over mainstream attention-based methods. Notably, it markedly improves recognition of rare diseases and incidental, transient lesions, key challenges in clinical radiology.
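The authors' implementation is in the linked repository; the PyTorch sketch below is only a rough illustration of the dual-path idea described above, under our own assumptions. One encoder sees the whole image (global context), a second encoder is applied to the image weighted by each temporal gaze heatmap, and a GRU summarizes the sequence before fusion. All module names, the toy backbone, and the tensor shapes are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn


class DualPathGazeClassifier(nn.Module):
    """Illustrative dual-path model: global context + temporal gaze-local path."""

    def __init__(self, num_classes: int, feat_dim: int = 128):
        super().__init__()

        def encoder():
            # Tiny stand-in backbone; any CNN producing a feat_dim vector works.
            return nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=7, stride=2, padding=3),
                nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(32, feat_dim),
            )

        self.global_encoder = encoder()   # whole-image context
        self.local_encoder = encoder()    # gaze-weighted views, one per step
        self.temporal = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, image: torch.Tensor, gaze_maps: torch.Tensor):
        # image: (B, 1, H, W); gaze_maps: (B, T, 1, H, W), one map per time step
        g = self.global_encoder(image)                           # (B, D)
        b, t = gaze_maps.shape[:2]
        # Disintegrated view: the image weighted by each temporal gaze map.
        local = (image.unsqueeze(1) * gaze_maps).flatten(0, 1)   # (B*T, 1, H, W)
        z = self.local_encoder(local).view(b, t, -1)             # (B, T, D)
        _, h = self.temporal(z)                                  # h: (1, B, D)
        # Integration: fuse global context with the temporal-local summary.
        return self.head(torch.cat([g, h.squeeze(0)], dim=-1))


# Usage with random tensors (batch of 2 images, 8 gaze time steps):
model = DualPathGazeClassifier(num_classes=20)
logits = model(torch.randn(2, 1, 224, 224), torch.rand(2, 8, 1, 224, 224))
```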
📝 Abstract
In this work, we present GazeLT, a human visual attention integration-disintegration approach for long-tailed disease classification. A radiologist's eye gaze has distinct patterns that capture both fine-grained and coarse-level disease-related information. A radiologist's attention also varies over the course of interpreting an image, and incorporating this temporal variation into a deep learning framework is critical for improving automated image interpretation. Another important aspect of visual attention is that, apart from looking at major/obvious disease patterns, experts also examine minor/incidental findings (some of which constitute long-tailed classes) during image interpretation. GazeLT harnesses the temporal aspect of this visual search process, via an integration and disintegration mechanism, to improve long-tailed disease classification. We show the efficacy of GazeLT on two publicly available datasets for long-tailed disease classification, NIH-CXR-LT (n=89,237) and MIMIC-CXR-LT (n=111,898). GazeLT outperforms the best long-tailed loss by 4.1% and the visual attention-based baseline by 21.7% in average accuracy on these datasets. Our code is available at https://github.com/lordmoinak1/gazelt.
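The mechanism operates on the temporal structure of the gaze data. As a minimal sketch of a typical preprocessing step (our assumption, not the released code), the snippet below bins raw fixations into temporal windows, a disintegrated view of the scanpath, and renders each window as a Gaussian-smoothed attention map; summing the maps recovers the integrated, full-duration attention. The function name and parameters are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def fixations_to_heatmaps(fixations, height, width, num_steps=8, sigma=15.0):
    """fixations: iterable of (x, y, t) tuples, t in seconds.
    Returns an array of shape (num_steps, height, width)."""
    times = np.array([t for _, _, t in fixations], dtype=np.float64)
    # Disintegration: split the scanpath into num_steps equal time windows.
    edges = np.linspace(times.min(), times.max() + 1e-6, num_steps + 1)
    maps = np.zeros((num_steps, height, width), dtype=np.float32)
    for x, y, t in fixations:
        step = int(np.clip(np.searchsorted(edges, t, side="right") - 1,
                           0, num_steps - 1))
        yi = int(np.clip(round(y), 0, height - 1))
        xi = int(np.clip(round(x), 0, width - 1))
        maps[step, yi, xi] += 1.0  # accumulate fixation hits per window
    for k in range(num_steps):
        maps[k] = gaussian_filter(maps[k], sigma=sigma)  # smooth hits into a map
        peak = maps[k].max()
        if peak > 0:
            maps[k] /= peak  # normalize each window's map to [0, 1]
    # Integration: maps.sum(axis=0) recovers the full-duration attention map.
    return maps
```

These per-step maps, stacked across time, correspond to the kind of gaze-map sequence consumed by the dual-path sketch shown after the summary above.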