AI Summary
3D universal lesion detection in CT volumes faces challenges including the absence of voxel-level 3D annotations in DeepLesion, severe class imbalance, and a lack of anatomical localization. Method: We propose the first fully automatic framework supporting 3D lesion localization, fine-grained classification, and joint anatomical region annotation. Our approach builds upon VFNet as the 2D detection backbone, introduces a novel 2D-to-3D contextual expansion mechanism, and employs a multi-round self-training strategy, achieving performance comparable to full supervision using only 30% of the DeepLesion data. Results: The method attains a mean sensitivity of 46.9% at 0.125 to 8 false positives per scan, matching the fully supervised baseline. It is the first to enable end-to-end 3D lesion detection with concurrent anatomical region labeling, significantly enhancing clinical utility and generalizability.
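The paper does not spell out the 2D-to-3D contextual expansion in this summary, but a common way to realize it is to link a detector's per-slice 2D boxes across adjacent slices by overlap and merge each linked chain into one 3D bounding box. The sketch below is an illustrative assumption, not the authors' exact procedure; the IoU threshold of 0.5 is a hypothetical choice.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def expand_to_3d(boxes_per_slice, iou_thr=0.5):
    """Link per-slice 2D detections into 3D lesion proposals.

    boxes_per_slice: {slice_index: [(x1, y1, x2, y2), ...]}
    Returns a list of (x1, y1, z1, x2, y2, z2) boxes.
    """
    tracks, proposals = [], []  # active tracks: (last_z, last_box, running 3D box)
    for z in sorted(boxes_per_slice):
        new_tracks = []
        unmatched = list(boxes_per_slice[z])
        for last_z, last_box, (x1, y1, z1, x2, y2, z2) in tracks:
            # Find the best-overlapping box on this slice for the track.
            best, best_iou = None, iou_thr
            for b in unmatched:
                v = iou(last_box, b)
                if v >= best_iou:
                    best, best_iou = b, v
            if last_z == z - 1 and best is not None:
                # Extend the track and grow its 3D extent.
                unmatched.remove(best)
                merged = (min(x1, best[0]), min(y1, best[1]), z1,
                          max(x2, best[2]), max(y2, best[3]), z)
                new_tracks.append((z, best, merged))
            else:
                # Track ended: emit the accumulated 3D box.
                proposals.append((x1, y1, z1, x2, y2, z2))
        for b in unmatched:  # start a new track for each unmatched box
            new_tracks.append((z, b, (b[0], b[1], z, b[2], b[3], z)))
        tracks = new_tracks
    proposals.extend(t[2] for t in tracks)  # flush tracks still open at the end
    return proposals
```

A greedy adjacent-slice linker like this is simple and fast; more elaborate variants re-run the 2D detector on neighboring slices around a seed detection before linking.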
Abstract
Radiologists routinely perform the tedious task of lesion localization, classification, and size measurement in computed tomography (CT) studies. Universal lesion detection and tagging (ULDT) can simultaneously help alleviate the cumbersome nature of lesion measurement and enable tumor burden assessment. Previous ULDT approaches utilize the publicly available DeepLesion dataset; however, it does not provide the full volumetric (3D) extent of lesions and also exhibits a severe class imbalance. In this work, we propose a self-training pipeline to detect 3D lesions and tag them according to the body part in which they occur. We used a significantly limited 30% subset of DeepLesion to train a VFNet model for 2D lesion detection and tagging. Next, the 2D lesion context was expanded into 3D, and the mined 3D lesion proposals were integrated back into the baseline training data to retrain the model over multiple rounds. Through this self-training procedure, our VFNet model learned from its own predictions, detected lesions in 3D, and tagged them. Our results indicate that our VFNet model achieved an average sensitivity of 46.9% at [0.125:8] false positives (FP) with a limited 30% data subset, compared to the 46.8% of an existing approach that used the entire DeepLesion dataset. To our knowledge, we are the first to jointly detect lesions in 3D and tag them according to the body part label.
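The multi-round self-training procedure in the abstract can be sketched as a generic loop: train on the labeled 30% subset, mine confident proposals from unlabeled volumes, fold them back into the training set, and retrain. The code below is a minimal, framework-agnostic sketch; `train_fn`, `predict_fn`, the confidence threshold, and the round count are all hypothetical stand-ins, not values from the paper.

```python
def self_train(labeled, unlabeled, train_fn, predict_fn,
               score_thr=0.9, rounds=3):
    """Multi-round self-training on mined lesion proposals.

    labeled:    initially annotated examples (e.g. the 30% DeepLesion subset)
    unlabeled:  volumes without annotations
    train_fn:   fits a model on a list of examples and returns it
    predict_fn: (model, item) -> (proposal, confidence_score)
    """
    data = list(labeled)
    model = train_fn(data)
    for _ in range(rounds):
        mined = []
        for item in unlabeled:
            proposal, score = predict_fn(model, item)
            if score >= score_thr:      # keep only confident proposals
                mined.append(proposal)
        data = list(labeled) + mined    # merge mined lesions with baseline data
        model = train_fn(data)          # retrain for the next round
    return model
```

The key design point is that each round restarts from the original labeled set plus the freshly mined proposals, which limits the accumulation of noisy pseudo-labels across rounds.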