Unmasking Interstitial Lung Diseases: Leveraging Masked Autoencoders for Diagnosis

📅 2025-08-06

📈 Citations: 0

✨ Influential: 0

career value

182K/year

🤖 AI Summary

To address the scarcity of high-quality annotated CT data for interstitial lung disease (ILD) diagnosis, this paper proposes a self-supervised learning framework based on the Masked Autoencoder (MAE). The model is pre-trained unsupervisedly on over 5,000 unlabeled chest CT scans, integrating both institutional and public datasets to enhance generalizability. Subsequently, it is fine-tuned via transfer learning for ILD subtype classification. To our knowledge, this is the first systematic application of MAE to ILD CT analysis. The approach substantially alleviates the small-sample bottleneck: with only 32 labeled samples per class, classification accuracy improves by 12.6% over supervised baselines. Moreover, the learned representations exhibit strong clinical interpretability, validated by radiologist-annotated attention maps. The model and source code are publicly released, establishing a novel paradigm for few-shot learning in medical imaging.

Technology Category

Application Category

📝 Abstract

Masked autoencoders (MAEs) have emerged as a powerful approach for pre-training on unlabelled data, capable of learning robust and informative feature representations. This is particularly advantageous in diffused lung disease research, where annotated imaging datasets are scarce. To leverage this, we train an MAE on a curated collection of over 5,000 chest computed tomography (CT) scans, combining in-house data with publicly available scans from related conditions that exhibit similar radiological patterns, such as COVID-19 and bacterial pneumonia. The pretrained MAE is then fine-tuned on a downstream classification task for diffused lung disease diagnosis. Our findings demonstrate that MAEs can effectively extract clinically meaningful features and improve diagnostic performance, even in the absence of large-scale labelled datasets. The code and the models are available here: https://github.com/eedack01/lung_masked_autoencoder.

Problem

Research questions and friction points this paper is trying to address.

Diagnosing interstitial lung diseases using masked autoencoders

Overcoming scarce annotated lung CT scan datasets

Improving diagnostic accuracy with limited labeled data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses masked autoencoders for feature learning

Trains on 5000+ CT scans including COVID-19

Fine-tunes MAE for lung disease diagnosis

🔎 Similar Papers

Developing a Dual-Stage Vision Transformer Model for Lung Disease Classification