Developing a Dual-Stage Vision Transformer Model for Lung Disease Classification

📅 2024-09-26
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the classification of 14 lung diseases in chest X-ray images. We propose a two-stage cascaded vision transformer architecture that combines a ViT, which captures global semantic context, with a Swin Transformer, which models local texture and lesion-level detail, so that the model encodes both long-range dependencies and discriminative pathological features. The model is trained end-to-end under supervised learning with medical-image-specific preprocessing. On an independent test set, it achieves a label-wise classification accuracy of 92.06%, 3.2 percentage points higher than single-stage baselines. The results suggest the approach is promising for AI-assisted diagnosis of thoracic pathologies.

📝 Abstract
Lung diseases are prevalent throughout the United States, affecting over 34 million people. Accurate and timely diagnosis of the different types of lung disease is critical, and Artificial Intelligence (AI) methods could speed up this process. In this research, a dual-stage vision transformer is built by integrating a Vision Transformer (ViT) and a Swin Transformer to classify 14 different lung diseases from patients' X-ray scans. After data preprocessing and training, the proposed model achieved a label-level accuracy of 92.06% on an unseen testing subset of the dataset. The model shows promise for accurately classifying lung diseases and supporting the diagnosis of patients who suffer from them.
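
The abstract reports accuracy "on a label-level" without spelling out the computation. One common reading, assumed here and not confirmed by the source, is a multi-label setup in which each X-ray carries 14 binary labels and accuracy is the fraction of individual (image, label) decisions that are correct. The sketch below illustrates that reading in plain Python; the function name `label_level_accuracy`, the 0.5 threshold, and the toy data are all hypothetical.

```python
from typing import Sequence


def label_level_accuracy(
    y_true: Sequence[Sequence[int]],
    y_prob: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> float:
    """Fraction of (image, label) pairs whose thresholded prediction is correct."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must contain the same number of images")

    correct = 0
    total = 0
    for true_row, prob_row in zip(y_true, y_prob):
        if len(true_row) != len(prob_row):
            raise ValueError("each image needs one probability per label")
        for truth, prob in zip(true_row, prob_row):
            predicted = 1 if prob >= threshold else 0
            correct += int(predicted == truth)
            total += 1

    if total == 0:
        raise ValueError("no labels to score")
    return correct / total


# Toy example with 2 images and 3 labels (the paper uses 14 labels).
toy_true = [[1, 0, 0], [0, 1, 1]]
toy_prob = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3]]
print(label_level_accuracy(toy_true, toy_prob))  # 4 of 6 decisions are correct
```

Under this definition the metric equals one minus the Hamming loss, so it can stay high even when few images have every label predicted correctly.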
Problem

Research questions and friction points this paper is trying to address.

Classifying 14 lung diseases from X-ray scans
Improving accuracy in lung disease diagnosis
Developing a dual-stage vision transformer model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-stage Vision Transformer model
Integrates ViT and Swin Transformer
Classifies 14 lung diseases
Anirudh Mazumder
Department of Engineering, University of North Texas
Jianguo Liu
Department of Mathematics, University of North Texas