🤖 AI Summary
This study addresses the limitations of pulmonary nodule CT datasets, which often suffer from small sample sizes and insufficient diversity, thereby constraining the performance of detection models. To overcome this, the authors propose a two-stage generative adversarial network (TSGAN) that decouples nodule structure from texture. In the first stage, StyleGAN generates semantic segmentation masks to precisely control anatomical structure; in the second stage, a DL-Pix2Pix framework with a local importance attention mechanism translates these masks into high-fidelity CT images, aided by dynamic-weight multi-head window attention for improved detail synthesis. Experiments on LUNA16 show that detection models trained with TSGAN-augmented data achieve a 4.6% increase in accuracy and a 4% improvement in mAP, indicating gains in both data diversity and model generalization.
📝 Abstract
The limited sample size and insufficient diversity of lung nodule CT datasets severely restrict the performance and generalization ability of detection models. Existing methods generate images with insufficient diversity and controllability, suffering from monotonous texture features and distorted anatomical structures. We therefore propose a two-stage generative adversarial network (TSGAN) that enhances the diversity and spatial controllability of synthetic data by decoupling the morphological structure and texture features of lung nodules. In the first stage, StyleGAN generates semantic segmentation masks that encode lung nodules and tissue backgrounds, controlling the anatomical structure of the synthesized images. The second stage uses the DL-Pix2Pix model to translate each mask into a CT image, employing local importance attention to capture local features and dynamic-weight multi-head window attention to strengthen the modeling of lung nodule texture and background. When detection models were trained on TSGAN-augmented data, accuracy improved by 4.6% and mAP by 4% on the LUNA16 dataset compared to training on the original dataset alone. Experimental results demonstrate that TSGAN enhances both the quality of synthetic images and the performance of detection models.
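The two-stage pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration of the data flow only: the function names, shapes, class labels, and the toy single-head window attention are stand-ins chosen here, not the authors' StyleGAN, DL-Pix2Pix, or dynamic-weight multi-head window attention implementations.

```python
import numpy as np

rng = np.random.default_rng(0)

def stage1_mask_generator(latent):
    # Stand-in for the StyleGAN mask generator: latent -> 64x64 label map
    # with hypothetical classes {0: background, 1: lung tissue, 2: nodule}.
    logits = rng.normal(size=(64, 64, 3)) + latent[:3]
    return logits.argmax(axis=-1)

def window_self_attention(feat, window=8):
    # Toy single-head self-attention inside non-overlapping windows; a
    # simplified stand-in for dynamic-weight multi-head window attention.
    h, w = feat.shape
    out = np.empty_like(feat)
    wq, wk, wv = rng.normal(size=3)  # 1-d per-pixel token projections
    for i in range(0, h, window):
        for j in range(0, w, window):
            tokens = feat[i:i + window, j:j + window].reshape(-1)
            q, k, v = tokens * wq, tokens * wk, tokens * wv
            scores = np.outer(q, k)                      # (n, n) logits
            scores -= scores.max(axis=1, keepdims=True)  # stable softmax
            attn = np.exp(scores)
            attn /= attn.sum(axis=1, keepdims=True)
            out[i:i + window, j:j + window] = (attn @ v).reshape(window, window)
    return out

def stage2_translator(mask):
    # Stand-in for DL-Pix2Pix: embed class labels as intensity priors,
    # mix them with window attention, squash to a pseudo-CT range.
    embed = np.take(np.array([-0.8, 0.1, 0.9]), mask)
    mixed = window_self_attention(embed + 0.05 * rng.normal(size=mask.shape))
    return np.tanh(mixed)

latent = rng.normal(size=16)
mask = stage1_mask_generator(latent)  # stage 1: anatomical structure
ct = stage2_translator(mask)          # stage 2: texture synthesis
print(mask.shape, ct.shape)
```

The key design point the sketch mirrors is the decoupling: stage one fixes *where* the nodule and tissue lie (the mask), so stage two only has to learn *how* they look (texture), which is what gives the method its spatial controllability.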