Random forest-based out-of-distribution detection for robust lung cancer segmentation

📅 2025-08-26
🤖 AI Summary
To address the limited out-of-distribution (OOD) generalization of lung lesion segmentation in CT imaging, this paper proposes RF-Deep, a random forest classifier for OOD detection. It sits on top of a segmentation model that pairs a Swin Transformer encoder, pretrained with SimMIM self-supervision, with a convolutional decoder. RF-Deep takes deep-layer features from the frozen pretrained encoder and trains a lightweight random forest on them, so OOD detection needs no fine-tuning of the backbone. Evaluated on OOD CT datasets covering pulmonary embolism, COVID-19, and abdominal CT, RF-Deep achieves FPR95 scores of 18.26%, 27.66%, and <0.1%, respectively, outperforming established OOD detection approaches. Its core contribution is the integration of deep features from a pretrained vision transformer with a random forest for OOD identification in medical imaging, aiming for a favorable trade-off among detection performance, computational efficiency, and clinical interpretability.

📝 Abstract
Accurate detection and segmentation of cancerous lesions from computed tomography (CT) scans is essential for automated treatment planning and cancer treatment response assessment. Transformer-based models with self-supervised pretraining can produce reliably accurate segmentation from in-distribution (ID) data but degrade when applied to out-of-distribution (OOD) datasets. We address this challenge with RF-Deep, a random forest classifier that uses deep features from the pretrained transformer encoder of the segmentation model to detect OOD scans and enhance segmentation reliability. The segmentation model comprises a Swin Transformer encoder, pretrained with masked image modeling (SimMIM) on 10,432 unlabeled 3D CT scans covering cancerous and non-cancerous conditions, and a convolutional decoder, trained to segment lung cancers in 317 3D scans. Independent testing was performed on 603 3D CT scans from public datasets that included one ID dataset and four OOD datasets comprising chest CTs with pulmonary embolism (PE) and COVID-19, and abdominal CTs with kidney cancers and healthy volunteers. RF-Deep detected OOD cases with an FPR95 of 18.26%, 27.66%, and less than 0.1% on PE, COVID-19, and abdominal CTs, consistently outperforming established OOD approaches. The RF-Deep classifier provides a simple and effective approach to enhance reliability of cancer segmentation in ID and OOD scenarios.
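The FPR95 figures above are false positive rates measured at the threshold where the true positive rate reaches 95%. The sketch below shows one way to compute that metric from per-scan scores. It is illustrative only: the function name `fpr_at_tpr`, the convention that a higher score means "more likely OOD", and the choice of OOD as the positive class are assumptions, not details taken from the paper.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, target_tpr=0.95):
    """False positive rate at the threshold that reaches `target_tpr`.

    Assumptions (not from the paper): higher score = more likely OOD,
    and OOD is the positive class, so a false positive is an
    in-distribution scan flagged as OOD.
    """
    if not id_scores or not ood_scores:
        raise ValueError("need at least one ID score and one OOD score")

    # Rank OOD scores from highest to lowest. Flagging the top k scans
    # detects a fraction k / n of all OOD scans.
    ranked = sorted(ood_scores, reverse=True)
    n = len(ranked)

    # Smallest k with k / n >= target_tpr (small epsilon guards against
    # floating-point round-up in the product).
    k = max(1, math.ceil(target_tpr * n - 1e-9))
    threshold = ranked[k - 1]

    # Every scan scoring at or above the threshold is flagged as OOD.
    false_positives = sum(1 for s in id_scores if s >= threshold)
    return false_positives / len(id_scores)
```

With perfectly separated scores the function returns 0.0; overlap between the ID and OOD score distributions pushes the value up, which is why lower FPR95 indicates a better detector.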
Problem

Research questions and friction points this paper is trying to address.

Detecting out-of-distribution CT scans for reliable segmentation
Improving lung cancer segmentation accuracy on diverse datasets
Enhancing transformer model robustness against distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random forest classifier using deep features
Swin Transformer encoder with masked pretraining
OOD detection enhancing segmentation reliability
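
The first idea in the list above, a random forest trained on deep encoder features, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the encoder's feature maps are already extracted as NumPy arrays of shape (N, C, D, H, W), that they are reduced by global average pooling, and that the forest is trained as a binary classifier on labeled ID and OOD examples. The helper names `pool_features`, `fit_ood_forest`, and `ood_score` are hypothetical, and scikit-learn stands in for whatever library the paper used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def pool_features(feature_maps):
    """Global-average-pool (N, C, D, H, W) feature maps to (N, C) vectors.

    Assumption: pooling is one plausible way to get a fixed-length
    descriptor per scan; the paper may aggregate features differently.
    """
    return feature_maps.mean(axis=(2, 3, 4))


def fit_ood_forest(id_features, ood_features, n_trees=100, seed=0):
    """Fit a random forest to separate ID (label 0) from OOD (label 1)."""
    X = np.concatenate([id_features, ood_features])
    y = np.concatenate([
        np.zeros(len(id_features), dtype=int),
        np.ones(len(ood_features), dtype=int),
    ])
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    return forest


def ood_score(forest, features):
    """Per-scan OOD score: predicted probability of the OOD class."""
    # Classes are sorted as [0, 1], so column 1 is the OOD probability.
    return forest.predict_proba(features)[:, 1]
```

Because the encoder is only used to produce features, the forest can be retrained cheaply when new OOD examples become available, without touching the segmentation model.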