AI Summary
Medical image multi-object segmentation faces a fundamental trade-off among accuracy, inference speed, memory efficiency, and robustness to imaging noise. To address this, we propose LDSeg, a latent-space conditional diffusion model and the first to formulate conditional diffusion directly in the latent space for medical segmentation. LDSeg jointly learns, in an end-to-end manner, image embeddings and low-dimensional shape manifolds of target anatomical structures. By leveraging implicit posterior sampling, it overcomes three key limitations of conventional diffusion models: excessive memory consumption, slow sampling, and unnatural noise priors. Evaluated on three diverse medical imaging modalities, LDSeg achieves state-of-the-art segmentation accuracy, significantly accelerates inference, and demonstrates superior robustness to acquisition noise compared to leading deterministic segmentation methods.
Abstract
Diffusion Probabilistic Models (DPMs) suffer from inefficient inference due to their slow sampling and high memory consumption, which limits their applicability to various medical imaging applications. In this work, we propose a novel conditional diffusion modeling framework (LDSeg) for medical image segmentation, utilizing the learned inherent low-dimensional latent shape manifolds of the target objects and the embeddings of the source image with an end-to-end framework. Conditional diffusion in latent space not only ensures accurate image segmentation for multiple interacting objects, but also tackles the fundamental issues of traditional DPM-based segmentation methods: (1) high memory consumption, (2) time-consuming sampling process, and (3) unnatural noise injection in the forward and reverse processes. The end-to-end training strategy enables robust representation learning in the latent space related to segmentation features, ensuring significantly faster sampling from the posterior distribution for segmentation generation in the inference phase. Our experiments demonstrate that LDSeg achieved state-of-the-art segmentation accuracy on three medical image datasets with different imaging modalities. In addition, we showed that our proposed model was significantly more robust to noise compared to traditional deterministic segmentation models. The code is available at https://github.com/FahimZaman/LDSeg.git.
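The memory and sampling argument above can be made concrete with a generic forward-noising step applied to a small latent vector instead of a full-resolution mask. The following is a minimal sketch under stated assumptions, not code from the LDSeg repository: it uses the standard DDPM closed form q(z_t | z_0) = N(sqrt(abar_t) z_0, (1 - abar_t) I), and the linear schedule, the 256x256 and 32x32 sizes, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not LDSeg's)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha-bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_latent(z0, t, alpha_bars, rng):
    """Draw z_t from q(z_t | z_0) using the closed-form DDPM forward step."""
    scale = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [scale * z + sigma * rng.gauss(0.0, 1.0) for z in z0]


# Hypothetical sizes: a 256x256 mask versus a 32x32 latent code.
image_elements = 256 * 256
latent_elements = 32 * 32
reduction = image_elements // latent_elements  # elements per diffusion step

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

# Stand-in for an encoded segmentation mask; a real model would produce this.
z0 = [rng.uniform(-1.0, 1.0) for _ in range(latent_elements)]
zt = noise_latent(z0, t=500, alpha_bars=alpha_bars, rng=rng)
```

With these assumed sizes, each diffusion step touches 64 times fewer elements than it would in image space, which is the kind of saving the abstract attributes to working in the latent space. The actual latent dimensions, noise schedule, and conditioning mechanism used by LDSeg are not stated in this section.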