🤖 AI Summary
Problem: Deep learning models for multi-label chest X-ray (CXR) disease classification rarely use the visual scanning patterns of expert radiologists, which limits both their interpretability and their performance. Method: We propose the first end-to-end framework integrating human-like scan path generation with iterative sequential modeling (ISM). Specifically: (1) an RNN-based scan path predictor generates anatomically grounded, human-like visual trajectories; (2) an attention-enhanced ISM module dynamically fuses these sequential paths with multi-scale CNN image features; and (3) the entire architecture is jointly optimized. Results: Evaluated on ~200K images from NIH-CXR and PadChest, our method achieves a 3.2% mAP improvement over path-agnostic baselines across 14 disease classes. This work pioneers the systematic integration of radiologist visual behavior modeling into CXR multi-label diagnosis, improving both diagnostic accuracy and decision interpretability.
📝 Abstract
Expert radiologists visually scan chest X-ray (CXR) images, fixating sequentially on anatomical structures to diagnose disease. An automatic multilabel classifier of diseases in CXR images can benefit from incorporating aspects of the radiologists’ approach, and recorded visual scanpaths of radiologists on CXR images can serve this purpose. However, such scanpaths are not available for most CXR images, which leaves a gap even for modern deep learning-based classifiers. This article proposes to mitigate this gap by generating effective artificial visual scanpaths with a visual scanpath prediction model for CXR images. Further, it proposes a multiclass multilabel classifier framework that uses a generated scanpath together with visual image features to classify diseases in CXR images. The scanpath predictor is based on a recurrent neural network, while the multilabel classifier involves a novel iterative sequential model (ISM) with an attention module. We show that our scanpath predictor generates human-like visual scanpaths, and that using artificial visual scanpaths improves multiclass multilabel disease classification results on CXR images. These observations come from experiments on around 0.2 million CXR images from two widely used datasets, considering the multilabel classification of 14 pathological findings. Code: https://github.com/ashishverma03/SDC
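To make the idea of classifying from a scanpath more concrete, the following is a minimal sketch, not the authors' ISM. It assumes each fixation in a scanpath has already been turned into a small feature vector, pools those vectors with plain dot-product attention, and then scores each finding with an independent sigmoid, so that several findings can be flagged for the same image. The function names (`attention_pool`, `multilabel_predict`), the query vector, and all weights are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(fixation_feats, query):
    """Dot-product attention over per-fixation feature vectors.

    fixation_feats: one feature vector per fixation, in scanpath order.
    query: a vector of the same dimension (hypothetical; in a trained
           model it would be learned or derived from image features).
    Returns the attention-weighted average vector and the weights.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in fixation_feats]
    weights = softmax(scores)
    dim = len(fixation_feats[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, fixation_feats))
        for d in range(dim)
    ]
    return pooled, weights


def multilabel_predict(features, label_weights, label_biases, threshold=0.5):
    """One independent sigmoid per finding (multilabel, not softmax)."""
    probs = []
    for w, b in zip(label_weights, label_biases):
        logit = sum(wi * xi for wi, xi in zip(w, features)) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    decisions = [p >= threshold for p in probs]
    return probs, decisions


# Toy example: three fixations with 2-D features and three findings.
fixations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]  # assumption: favours fixations strong in dimension 0
pooled, weights = attention_pool(fixations, query)

label_weights = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]  # illustrative only
label_biases = [0.0, 0.0, 0.0]
probs, decisions = multilabel_predict(pooled, label_weights, label_biases)
print(weights, probs, decisions)
```

Because each finding gets its own sigmoid, the probabilities need not sum to one, which is what separates multilabel prediction from single-label multiclass prediction. In the described framework the per-fixation features and the attention would come from trained networks; here they are fixed toy values.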