AI Summary
To address the low computational efficiency and weak semantic interpretability of unsupervised representation learning for Alzheimer's disease (AD) brain MRI, this paper proposes the Latent Diffusion Autoencoder (LDAE), a diffusion autoencoder that applies the diffusion process within the compressed latent space of a variational autoencoder rather than in image space, balancing expressive power and inference speed. LDAE supports multiple downstream tasks: AD classification, age prediction, anatomically plausible attribute editing, and longitudinal MRI interpolation. With linear probes on the learned representations, it reaches 90% ROC-AUC for AD diagnosis and a 4.1-year MAE for age estimation; semantic interpolation reconstructs a missing scan across a 6-month gap with an SSIM of 0.969. Inference is 20× faster than with conventional image-space diffusion autoencoders, and reconstruction quality also improves. The core contribution is a latent-space diffusion autoencoder that makes 3D medical image representation learning efficient, semantically interpretable, and usable across multiple tasks.
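As a rough illustration of the longitudinal interpolation task mentioned above, the sketch below estimates the semantic code of a missing scan by time-weighted linear interpolation between the codes of two available scans, then scores the estimate with MSE. The summary does not state how LDAE interpolates, so the linear scheme, the function names `interpolate_semantic` and `mse`, and the toy 2-D vectors are all assumptions made for illustration; in a real pipeline the interpolated code would be decoded back to an image before computing SSIM or MSE.

```python
from typing import List, Sequence


def interpolate_semantic(
    z_start: Sequence[float],
    z_end: Sequence[float],
    t_start: float,
    t_end: float,
    t_target: float,
) -> List[float]:
    """Time-weighted linear interpolation between two semantic vectors.

    Assumption: a linear path in the semantic space; not confirmed by the source.
    """
    if t_end == t_start:
        raise ValueError("t_start and t_end must differ")
    if len(z_start) != len(z_end):
        raise ValueError("semantic vectors must have the same length")
    alpha = (t_target - t_start) / (t_end - t_start)
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_start, z_end)]


def mse(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean squared error between two equally sized vectors."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Toy 2-D "semantic codes" at month 0 and month 24 (made-up numbers).
z_month0 = [0.0, 2.0]
z_month24 = [4.0, 6.0]

# Estimate the code of a missing scan at month 6.
z_month6 = interpolate_semantic(z_month0, z_month24, 0, 24, 6)
print(z_month6)  # [1.0, 3.0]
```

Weighting by acquisition time rather than taking the midpoint keeps the estimate closer to the nearer scan, which matters when the missing visit is not centered between the two observed ones.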
Abstract
This study presents the Latent Diffusion Autoencoder (LDAE), a novel encoder-decoder diffusion-based framework for efficient and meaningful unsupervised learning in medical imaging, focusing on Alzheimer's disease (AD) using brain MR from the ADNI database as a case study. Unlike conventional diffusion autoencoders operating in image space, LDAE applies the diffusion process in a compressed latent representation, improving computational efficiency and making 3D medical imaging representation learning tractable. To validate the proposed approach, we explore two key hypotheses: (i) LDAE effectively captures meaningful semantic representations on 3D brain MR associated with AD and ageing, and (ii) LDAE achieves high-quality image generation and reconstruction while being computationally efficient. Experimental results support both hypotheses: (i) linear-probe evaluations demonstrate promising diagnostic performance for AD (ROC-AUC: 90%, ACC: 84%) and age prediction (MAE: 4.1 years, RMSE: 5.2 years); (ii) the learned semantic representations enable attribute manipulation, yielding anatomically plausible modifications; (iii) semantic interpolation experiments show strong reconstruction of missing scans, with an SSIM of 0.969 (MSE: 0.0019) for a 6-month gap, and the model maintains robust performance even for longer gaps of 24 months (SSIM > 0.93, MSE < 0.004), indicating an ability to capture temporal progression trends; (iv) compared to conventional diffusion autoencoders, LDAE significantly increases inference throughput (20x faster) while also enhancing reconstruction quality. These findings position LDAE as a promising framework for scalable medical imaging applications, with the potential to serve as a foundation model for medical image analysis. Code is available at https://github.com/GabrieleLozupone/LDAE
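To make the linear-probe evaluation concrete, here is a minimal sketch: a logistic-regression classifier is trained on frozen feature vectors, and its scores are evaluated with ROC-AUC computed as the fraction of (positive, negative) pairs ranked correctly. This is not the authors' evaluation code; the helper names (`train_linear_probe`, `probe_scores`, `roc_auc`), the hyperparameters, and the one-dimensional toy features are assumptions chosen to keep the example dependency-free. In practice the features would be the semantic vectors produced by the frozen encoder.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probe_scores(
    features: Sequence[Sequence[float]], w: Sequence[float], b: float
) -> List[float]:
    """Linear scores w.x + b for each frozen feature vector."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + b for x in features]


def train_linear_probe(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit logistic regression on fixed features with batch gradient descent."""
    n, dim = len(features), len(features[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y, s in zip(features, labels, probe_scores(features, w, b)):
            err = _sigmoid(s) - y  # gradient of the log loss w.r.t. the score
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy stand-in for frozen encoder features (made-up, linearly separable).
features = [[-2.0], [-1.0], [1.0], [2.0]]
labels = [0, 0, 1, 1]

w, b = train_linear_probe(features, labels)
auc = roc_auc(probe_scores(features, w, b), labels)
print(auc)  # 1.0
```

Because only the linear layer is trained, a high probe score reflects information already present in the representation rather than capacity added by the classifier. A real evaluation would also compute the metric on held-out subjects, not on the training set as in this toy case.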