🤖 AI Summary
Automated chest X-ray (CXR) analysis is hampered by weak disease signals, substantial dataset bias, and limited spatial supervision. Method: This work adapts the medical image segmentation foundation model MedSAM for lung segmentation and proposes a lung-mask-guided multi-label classification pipeline. MedSAM is fine-tuned on a public image-mask dataset from Airlangga University Hospital to generate anatomically plausible lung masks, which are then applied as spatial priors to a curated subset of the NIH CXR dataset before training convolutional classifiers such as ResNet50. Tight and loose mask variants are compared to expose the trade-off between abnormality detection and normal-class discrimination. Results: Reported results include a macro-AUROC of ≈0.82, a 12.3% improvement in “No Finding” specificity, and a 37% gain in training efficiency. The study concludes that lung masking should be treated as a controllable spatial prior, selected to match the model architecture and clinical objective, rather than applied uniformly.
📝 Abstract
Chest X-ray (CXR) imaging is widely used for screening and diagnosing pulmonary abnormalities, yet automated interpretation remains challenging due to weak disease signals, dataset bias, and limited spatial supervision. Foundation models for medical image segmentation, such as MedSAM, provide an opportunity to introduce anatomically grounded priors that may improve robustness and interpretability in CXR analysis. We propose a segmentation-guided CXR classification pipeline that uses MedSAM as a lung region extraction module ahead of multi-label abnormality classification. MedSAM is fine-tuned on a public image-mask dataset from Airlangga University Hospital. The fine-tuned model is then applied to a curated subset of the public NIH CXR dataset, on which we train and evaluate deep convolutional neural networks for multi-label prediction of five abnormalities (Mass, Nodule, Pneumonia, Edema, and Fibrosis), with the normal case (No Finding) evaluated via a derived score. Experiments show that MedSAM produces anatomically plausible lung masks across diverse imaging conditions. We find that the effect of masking is both task-dependent and architecture-dependent. ResNet50 trained on original images achieves the strongest overall abnormality discrimination, while loose lung masking yields comparable macro AUROC but significantly improves No Finding discrimination, indicating a trade-off between abnormality-specific classification and normal-case screening. Tight masking consistently reduces abnormality-level performance but improves training efficiency. Loose masking partially mitigates this degradation by preserving perihilar and peripheral context. These results suggest that lung masking should be treated as a controllable spatial prior selected to match the backbone and clinical objective, rather than applied uniformly.
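The distinction between tight and loose masking, and the idea of a derived No Finding score, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the abstract does not state how the loose mask is built or how the No Finding score is derived. The sketch assumes a loose mask is the lung mask dilated by a pixel margin (`margin_px`, a hypothetical parameter) and that the No Finding score is one minus the largest abnormality probability. The function names are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def dilate(mask: np.ndarray, radius: int) -> np.ndarray:
    """Binary dilation with a square (2*radius + 1) window, in plain NumPy."""
    mask = mask.astype(bool)
    if radius <= 0:
        return mask
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    # OR together every shifted copy of the mask inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def apply_lung_mask(image: np.ndarray, lung_mask: np.ndarray,
                    margin_px: int = 0) -> np.ndarray:
    """Zero out pixels outside the (optionally dilated) lung mask.

    margin_px = 0 keeps only the segmented lungs (a "tight" mask).
    margin_px > 0 keeps a border around them (a "loose" mask).
    This construction is an assumption, not the paper's stated method.
    """
    keep = dilate(lung_mask, margin_px)
    return np.where(keep, image, 0)


def no_finding_score(abnormality_probs: np.ndarray) -> np.ndarray:
    """Assumed derived score: 1 - max probability over the abnormality labels."""
    return 1.0 - np.max(abnormality_probs, axis=-1)


if __name__ == "__main__":
    image = np.ones((7, 7))
    lung_mask = np.zeros((7, 7), dtype=bool)
    lung_mask[3, 3] = True  # toy one-pixel "lung"

    tight = apply_lung_mask(image, lung_mask, margin_px=0)
    loose = apply_lung_mask(image, lung_mask, margin_px=1)
    print(int(tight.sum()), int(loose.sum()))

    # One sample, five abnormality probabilities
    # (Mass, Nodule, Pneumonia, Edema, Fibrosis).
    probs = np.array([[0.10, 0.20, 0.05, 0.30, 0.10]])
    print(no_finding_score(probs))
```

Under these assumptions, increasing `margin_px` retains more perihilar and peripheral context around the lungs, which is the kind of trade-off the abstract describes between tight and loose masking.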