🤖 AI Summary
This work addresses the scarcity of high-quality brain segmentation masks in non-contrast CT neuroimaging, which stems from the high cost of manual annotation and the substantial variability of ischemic infarcts. To tackle this challenge, the authors propose an anatomy-preserving generative framework that runs a diffusion model in the latent space of a variational autoencoder (VAE) trained on segmentation masks. The method enables unconditional generation of multi-class brain tissue masks containing ischemic infarcts and supports coarse control over lesion presence via a binary prompt. Because masks are reconstructed by a frozen VAE decoder, the qualitative results indicate that the approach preserves global anatomical structure, discrete semantic labels, and realistic pathological variation while avoiding the structural artifacts common to pixel-space generative models.
📝 Abstract
The scarcity of high-quality segmentation masks remains a major bottleneck for medical image analysis, particularly in non-contrast CT (NCCT) neuroimaging, where manual annotation is costly and variable. To address this limitation, we propose an anatomy-preserving generative framework for the unconditional synthesis of multi-class brain segmentation masks, including ischemic infarcts. The proposed approach combines a variational autoencoder trained exclusively on segmentation masks to learn an anatomical latent representation, with a diffusion model operating in this latent space to generate new samples from pure noise. At inference, synthetic masks are obtained by decoding denoised latent vectors through the frozen VAE decoder, with optional coarse control over lesion presence via a binary prompt. Qualitative results show that the generated masks preserve global brain anatomy, discrete tissue semantics, and realistic variability, while avoiding the structural artifacts commonly observed in pixel-space generative models. Overall, the proposed framework offers a simple and scalable solution for anatomy-aware mask generation in data-scarce medical imaging scenarios.
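The inference path described in the abstract (start from pure noise, denoise in the latent space, decode through the frozen VAE decoder, read off discrete labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_denoiser` and `toy_frozen_decoder` are hypothetical stand-ins for the trained networks, the class count, latent shape, step count, and noise schedule are invented for the example, and the reverse update is the standard DDPM ancestral step, which the abstract does not confirm as the sampler actually used. How the binary lesion prompt enters the network is also an assumption.

```python
import numpy as np

# All sizes below are assumptions for illustration, not values from the paper.
NUM_CLASSES = 4             # e.g. background, two tissue classes, infarct
LATENT_SHAPE = (8, 16, 16)  # (channels, height, width) of the VAE latent
MASK_SIZE = 64              # side length of the decoded mask
NUM_STEPS = 50              # number of reverse diffusion steps


def make_schedule(num_steps):
    """Linear beta schedule with the derived alpha and cumulative-alpha terms."""
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_denoiser(z_t, t, lesion_prompt):
    """Hypothetical stand-in for a trained noise predictor eps(z_t, t, prompt)."""
    return 0.1 * z_t + 0.01 * lesion_prompt


def toy_frozen_decoder(z, weights):
    """Hypothetical stand-in for the frozen VAE decoder: latent -> class logits."""
    scale = MASK_SIZE // z.shape[-1]
    # Nearest-neighbour upsampling of the latent to mask resolution.
    upsampled = z.repeat(scale, axis=1).repeat(scale, axis=2)
    # Per-pixel linear projection from latent channels to class logits.
    return np.einsum("kc,chw->khw", weights, upsampled)


def sample_mask(lesion_prompt, rng, decoder_weights):
    """Generate one label mask: noise -> latent denoising -> decode -> argmax."""
    betas, alphas, alpha_bars = make_schedule(NUM_STEPS)
    z = rng.standard_normal(LATENT_SHAPE)  # start from pure Gaussian noise
    for t in range(NUM_STEPS - 1, -1, -1):
        eps_hat = toy_denoiser(z, t, lesion_prompt)
        # Standard DDPM posterior mean for the previous latent.
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            z = mean + np.sqrt(betas[t]) * rng.standard_normal(LATENT_SHAPE)
        else:
            z = mean  # no noise is added at the final step
    logits = toy_frozen_decoder(z, decoder_weights)
    # Taking the argmax over classes keeps the output a discrete label map.
    return np.argmax(logits, axis=0).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((NUM_CLASSES, LATENT_SHAPE[0]))
    mask = sample_mask(lesion_prompt=1.0, rng=rng, decoder_weights=weights)
    print(mask.shape, np.unique(mask))
```

The point of the sketch is the division of labour: the diffusion loop only ever touches the latent tensor, and the decoder followed by an argmax is what guarantees that every output pixel carries exactly one class label. With the placeholder networks the resulting mask has no anatomical meaning; a real pipeline would substitute the trained denoiser and the frozen decoder of the mask VAE.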