Artificially Generated Visual Scanpath Improves Multilabel Thoracic Disease Classification in Chest X-Ray Images

📅 2025-03-01

🏛️ IEEE Transactions on Instrumentation and Measurement

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Current deep learning models for multi-label chest X-ray (CXR) disease classification suffer from limited interpretability and suboptimal performance because they lack expert-informed visual scanning patterns. Method: We propose the first end-to-end framework integrating human-like scanpath generation with an iterative sequential model (ISM). Specifically: (1) an RNN-based scanpath predictor generates anatomically grounded, human-like visual trajectories; (2) an attention-enhanced ISM module dynamically fuses these sequential scanpaths with multi-scale CNN image features; and (3) the entire architecture is jointly optimized. Results: Evaluated on ~200K images from NIH-CXR and PadChest, our method achieves a 3.2% mAP improvement over scanpath-agnostic baselines across 14 disease classes. This work pioneers the systematic integration of radiologist visual behavior modeling into CXR multi-label diagnosis, enhancing both diagnostic accuracy and decision interpretability.
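The summary reports its gain in mAP across 14 classes but does not say how the metric is computed. As a rough illustration only, the sketch below assumes the common definition: average precision (AP) is computed per class from the ranked scores, then averaged over classes. The function names and the toy data are hypothetical, and the paper's actual evaluation protocol may differ.

```python
def average_precision(scores, labels):
    """AP for one class: mean of the precision measured at each positive's rank."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    # Assumption: a class with no positives contributes 0.0; many toolkits skip it instead.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, label_rows):
    """Rows are images, columns are classes; returns the mean of per-class AP."""
    num_classes = len(score_rows[0])
    per_class = []
    for c in range(num_classes):
        class_scores = [row[c] for row in score_rows]
        class_labels = [row[c] for row in label_rows]
        per_class.append(average_precision(class_scores, class_labels))
    return sum(per_class) / num_classes


# Toy data (hypothetical): 4 images, 3 classes instead of 14.
scores = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.3, 0.6, 0.8],
    [0.1, 0.1, 0.6],
]
labels = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
]
map_score = mean_average_precision(scores, labels)
print(f"mAP = {map_score:.4f}")
```

Because each class is ranked independently, an image that carries several findings at once counts as a positive in every one of the corresponding columns.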

📝 Abstract

Expert radiologists visually scan Chest X-Ray (CXR) images, sequentially fixating on anatomical structures to perform disease diagnosis. An automatic multilabel classifier of diseases in CXR images can benefit from incorporating aspects of the radiologists’ approach. Recorded visual scanpaths of radiologists on CXR images can be used for this purpose. However, such scanpaths are not available for most CXR images, which creates a gap even for modern deep learning-based classifiers. This article proposes to mitigate this gap by generating effective artificial visual scanpaths using a visual scanpath prediction model for CXR images. Further, a multiclass multilabel classifier framework is proposed that uses a generated scanpath and visual image features to classify diseases in CXR images. While the scanpath predictor is based on a recurrent neural network, the multilabel classifier involves a novel iterative sequential model (ISM) with an attention module. We show that our scanpath predictor generates human-like visual scanpaths. We also demonstrate that the use of artificial visual scanpaths improves multiclass multilabel disease classification results on CXR images. These observations are drawn from experiments involving around 0.2 million CXR images from two widely used datasets, considering the multilabel classification of 14 pathological findings. Code: https://github.com/ashishverma03/SDC
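To make the idea of combining a scanpath with image features concrete, here is a minimal sketch in plain Python. It is not the paper's ISM: it does a single attention-pooling step over the features at each fixation, with random untrained weights, and then applies one independent sigmoid per finding. The grid size, feature dimension, fixation coordinates, and all function names are assumptions made for illustration; only the count of 14 findings comes from the abstract.

```python
import math
import random

NUM_FINDINGS = 14  # from the abstract: 14 pathological findings


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gather_fixation_features(feature_grid, scanpath):
    """Pick the feature vector at each fixated grid cell, in viewing order."""
    return [feature_grid[row][col] for row, col in scanpath]


def attention_pool(sequence, query):
    """Weight each fixation's features by softmax(query . feature) and sum them."""
    weights = softmax([dot(query, feat) for feat in sequence])
    dim = len(sequence[0])
    pooled = [sum(w * feat[d] for w, feat in zip(weights, sequence)) for d in range(dim)]
    return pooled, weights


def classify(pooled, class_weights, class_biases):
    """One sigmoid per finding, since the labels are not mutually exclusive."""
    return [sigmoid(dot(w, pooled) + b) for w, b in zip(class_weights, class_biases)]


# Hypothetical inputs: a 4x4 grid of 8-dimensional features standing in for CNN output.
rng = random.Random(0)
GRID, DIM = 4, 8
feature_grid = [[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(GRID)]
                for _ in range(GRID)]
scanpath = [(0, 1), (1, 1), (2, 2), (3, 0), (1, 3)]  # made-up (row, col) fixations
query = [rng.uniform(-1, 1) for _ in range(DIM)]  # untrained attention query
class_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_FINDINGS)]
class_biases = [0.0] * NUM_FINDINGS

sequence = gather_fixation_features(feature_grid, scanpath)
pooled, weights = attention_pool(sequence, query)
probs = classify(pooled, class_weights, class_biases)
predicted = [i for i, p in enumerate(probs) if p >= 0.5]
print("attention over fixations:", [round(w, 3) for w in weights])
print("findings above threshold:", predicted)
```

The sigmoid head is what makes this multilabel rather than single-label: each of the 14 outputs is thresholded on its own, so an image can be assigned several findings or none. In a trained system the scanpath would come from the predictor rather than a hand-written list, and the weights would be learned.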

Problem

Research questions and friction points this paper is trying to address.

Generates artificial visual scanpaths for CXR images

Improves multi-label thoracic disease classification

Uses deep learning to mimic radiologists' diagnostic approach

Innovation

Methods, ideas, or system contributions that make the work stand out.

Artificial visual scanpaths generated for CXR images

Recurrent neural network predicts human-like scanpaths

Iterative sequential model with attention improves classification
