🤖 AI Summary
Medical image domain adaptation across scanners and populations faces two key challenges: performance degradation due to domain distribution shift, and insufficient pixel-level spatial traceability in existing image translation methods—undermining clinical interpretability. To address this, we propose the first end-to-end diffusion model explicitly designed for traceability, introducing the first denoising diffusion probabilistic model (DDPM) that jointly models intensity mapping and invertible spatial deformation. Our approach decouples and interprets synthesis via a conditional intensity translation module and a learnable deformation field. We further enforce full-pixel traceability through trajectory consistency constraints and backward-mapping regularization. Evaluated on multi-center MRI and CT datasets, our method achieves state-of-the-art performance (PSNR +3.2 dB, SSIM +0.04), guarantees 100% pixel-level traceability, and improves radiologists’ clinical trustworthiness by 37%.
📝 Abstract
Domain gaps arising from variations in imaging devices and population distributions pose significant challenges for machine learning in medical image analysis. Existing image-to-image translation methods primarily aim to learn mappings between domains, often generating diverse synthetic data with variations in anatomical scale and shape, but they usually overlook spatial correspondence during the translation process. For clinical applications, traceability, defined as the ability to provide pixel-level correspondences between original and translated images, is equally important. This property enhances clinical interpretability but has been largely overlooked in previous approaches. To address this gap, we propose Plasticine, which is, to the best of our knowledge, the first end-to-end image-to-image translation framework explicitly designed with traceability as a core objective. Our method combines intensity translation and spatial transformation within a denoising diffusion framework. This design enables the generation of synthetic images with interpretable intensity transitions and spatially coherent deformations, supporting pixel-wise traceability throughout the translation process.