🤖 AI Summary
To address inaccurate lung localization, inconsistent lesion discrimination, and deployment challenges under resource constraints in pneumonia X-ray diagnosis, this paper proposes an end-to-end lightweight joint framework. First, we design a Transformer-enhanced lightweight TransUNet for precise lung segmentation, achieving a Dice score of 95.68%. Second, we construct a multi-scale ResNet–Transformer hybrid classifier to enable fine-grained lesion identification. Third, we introduce, for the first time, a segmentation-classification co-optimization mechanism with a unified training objective. Evaluated on the Kermany and Cohen datasets, our method attains classification accuracies of 93.75% and 96.04%, respectively, outperforming existing methods. The framework combines high accuracy, a low parameter count (≈3.2M), and strong generalizability, making it particularly suitable for deployment in resource-limited primary healthcare settings.
📝 Abstract
Pneumonia, a prevalent respiratory infection, remains a leading cause of morbidity and mortality worldwide, particularly among vulnerable populations. Chest X-rays serve as a primary tool for pneumonia detection; however, variations in imaging conditions and subtle visual indicators complicate consistent interpretation. Automated tools can enhance traditional methods by improving diagnostic reliability and supporting clinical decision-making. In this study, we propose a novel multi-scale transformer approach for pneumonia detection that integrates lung segmentation and classification into a unified framework. Our method introduces a lightweight transformer-enhanced TransUNet for precise lung segmentation, achieving a Dice score of 95.68% on the "Chest X-ray Masks and Labels" dataset with fewer parameters than traditional transformers. For classification, we employ pre-trained ResNet models (ResNet-50 and ResNet-101) to extract multi-scale feature maps, which are then processed through a modified transformer module to enhance pneumonia detection. This integration of multi-scale feature extraction and lightweight transformer modules ensures robust performance, making our method suitable for resource-constrained clinical environments. Our approach achieves 93.75% accuracy on the "Kermany" dataset and 96.04% accuracy on the "Cohen" dataset, outperforming existing methods while maintaining computational efficiency. This work demonstrates the potential of multi-scale transformer architectures to improve pneumonia diagnosis, offering a scalable and accurate solution to global healthcare challenges. Code: https://github.com/amirrezafateh/Multi-Scale-Transformer-Pneumonia
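The abstract reports segmentation quality as a Dice score but does not define it. As a minimal sketch, assuming the standard definition Dice = 2|A ∩ B| / (|A| + |B|) over a predicted and a ground-truth binary lung mask, the metric could be computed as follows. The function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not taken from the authors' repository.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists.

    Assumed definition: 2 * |pred AND target| / (|pred| + |target|).
    Any truthy pixel value counts as foreground.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels that are foreground in both masks
    pred_area = 0     # foreground pixels in the prediction
    target_area = 0   # foreground pixels in the ground truth
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            pred_area += p
            target_area += t

    if pred_area + target_area == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / (pred_area + target_area)


# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
score = dice_score(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

Under this definition, a reported Dice of 95.68% would correspond to a returned value of about 0.9568, presumably averaged over the test images; the abstract does not state the averaging scheme.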