🤖 AI Summary
Low-fidelity synthesis in abdominal CT volume generation—caused by anatomical complexity and ambiguous boundaries—hampers downstream self-supervised organ segmentation performance.
Method: We propose Lad (Locality-Aware Diffusion), a high-fidelity generative framework tailored to self-supervised segmentation. Lad integrates (1) a diffusion model trained with a novel locality-constrained loss that emphasizes critical anatomical regions; (2) a label-free abdominal prior extractor that implicitly encodes organ topology and intensity distributions; and (3) end-to-end joint optimization of the generative model, contrastive learning, and segmentation network.
Results: On AbdomenCT-1K, Lad lowers the FID score from 0.0034 to 0.0002, surpassing current generation methods. In self-supervised organ segmentation, it improves mean Dice on two abdominal datasets.
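The summary names a locality-constrained loss but does not give its form. As a rough illustration only, the sketch below shows one generic way to make a diffusion training loss emphasize a region: a noise-prediction mean squared error in which voxels inside a mask receive a larger weight. The function name `locality_weighted_mse`, the `region_mask` input, and the `region_weight` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np


def locality_weighted_mse(eps_pred, eps_true, region_mask, region_weight=2.0):
    """Noise-prediction MSE with extra weight on a masked region.

    Assumption: region_mask is a boolean volume marking voxels treated as
    anatomically critical. This is an illustrative stand-in, not Lad's loss.
    """
    eps_pred = np.asarray(eps_pred, dtype=np.float64)
    eps_true = np.asarray(eps_true, dtype=np.float64)
    mask = np.asarray(region_mask, dtype=bool)
    if not (eps_pred.shape == eps_true.shape == mask.shape):
        raise ValueError("eps_pred, eps_true and region_mask must share one shape")

    # Voxels inside the mask count region_weight times, all others once.
    weights = np.where(mask, region_weight, 1.0)
    squared_error = (eps_pred - eps_true) ** 2

    # Normalise by the total weight so the value stays on the scale of plain MSE.
    return float((weights * squared_error).sum() / weights.sum())


# Toy usage on a small random "volume" (depth, height, width).
rng = np.random.default_rng(0)
eps_true = rng.standard_normal((4, 8, 8))
eps_pred = eps_true + 0.1 * rng.standard_normal((4, 8, 8))
mask = np.zeros((4, 8, 8), dtype=bool)
mask[1:3, 2:6, 2:6] = True  # hypothetical critical region
loss = locality_weighted_mse(eps_pred, eps_true, mask)
```

With `region_weight=1.0` the function reduces to ordinary MSE, which gives a simple baseline for checking how much the extra weighting changes the training signal.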
📝 Abstract
In medical image analysis, self-supervised learning (SSL) techniques have emerged to alleviate labeling demands, but they still face a scarcity of training data owing to escalating resource requirements and privacy constraints. Numerous efforts employ generative models to produce high-fidelity, unlabeled 3D volumes across diverse modalities and anatomical regions. However, the intricate and hard-to-distinguish anatomical structures within the abdomen make abdominal CT volume generation uniquely challenging compared with other anatomical regions. To address this overlooked challenge, we introduce Locality-Aware Diffusion (Lad), a novel method tailored for high-quality 3D abdominal CT volume generation. We design a locality loss to refine crucial anatomical regions and devise a condition extractor to integrate abdominal priors into generation. Together, these enable the generation of large quantities of high-quality abdominal CT volumes for SSL tasks without additional data such as labels or radiology reports. Volumes generated by our method reproduce abdominal structures with remarkable fidelity, decreasing the FID score from 0.0034 to 0.0002 on the AbdomenCT-1K dataset, closely mirroring authentic data and surpassing current methods. Extensive experiments demonstrate the effectiveness of our method in self-supervised organ segmentation, improving mean Dice scores on two abdominal datasets. These results underscore the potential of synthetic data to advance self-supervised learning in medical image analysis.
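For readers unfamiliar with the segmentation metric cited above, the following is a minimal sketch of how a mean Dice score is commonly computed from integer label maps. It assumes one Dice value per organ label, averaged over organs; the abstract does not state the exact evaluation protocol, and the helper names `dice_score` and `mean_dice` are illustrative.

```python
import numpy as np


def dice_score(pred, target, label):
    """Dice overlap 2|A∩B| / (|A| + |B|) for one organ label."""
    pred_fg = np.asarray(pred) == label
    target_fg = np.asarray(target) == label
    denom = pred_fg.sum() + target_fg.sum()
    if denom == 0:
        # Organ absent from both maps: Dice is undefined, so report NaN.
        return float("nan")
    return float(2.0 * np.logical_and(pred_fg, target_fg).sum() / denom)


def mean_dice(pred, target, labels):
    """Average per-organ Dice, ignoring organs absent from both maps."""
    scores = [dice_score(pred, target, label) for label in labels]
    return float(np.nanmean(scores))


# Toy 2D example with background 0 and two organ labels, 1 and 2.
pred = np.array([[0, 1, 1],
                 [2, 2, 0]])
target = np.array([[0, 1, 0],
                   [2, 2, 2]])
score = mean_dice(pred, target, labels=[1, 2])
```

Background is excluded by leaving label 0 out of `labels`, which is a common convention but still an assumption here.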