WFM: 3D Wavelet Flow Matching for Ultrafast Multi-Modal MRI Synthesis

📅 2026-04-22

🤖 AI Summary
This work addresses the computational cost of diffusion models for multi-modal MRI synthesis, which typically need hundreds of sampling steps and a separate model per modality, making clinical use impractical. The authors propose WFM (Wavelet Flow Matching), which uses the mean of the conditioning modalities in wavelet space as an informed prior and learns a flow from that prior to the target distribution, so synthesis takes only one or two integration steps. A single 82M-parameter, class-conditioned model covers all four BraTS modalities (T1, T1c, T2, FLAIR), replacing four diffusion models totaling 326M parameters (about 75% fewer) and running 250–1000× faster (0.16–0.64 s vs. 160 s per volume). On BraTS 2024 it reaches 26.8 dB PSNR and 0.94 SSIM, within 1–2 dB of diffusion baselines, a trade-off the authors argue makes real-time synthesis practical for clinical workflows.
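The headline ratios can be checked directly from the figures reported in the abstract. The short Python snippet below only restates that arithmetic; the variable names are illustrative and not taken from the paper or its code.

```python
# Figures as reported in the paper's abstract.
wfm_params_m = 82            # single class-conditioned WFM model (millions of parameters)
diffusion_params_m = 326     # four separate diffusion models combined (millions)
diffusion_seconds = 160.0    # diffusion baseline, seconds per volume
wfm_seconds = (0.16, 0.64)   # WFM, seconds per volume (fastest and slowest reported)

# Fraction of parameters saved by using one model instead of four.
param_reduction = 1 - wfm_params_m / diffusion_params_m

# Speedup relative to the diffusion baseline at each reported WFM runtime.
speedups = [diffusion_seconds / s for s in wfm_seconds]

print(f"parameter reduction: {param_reduction:.1%}")
print(f"speedup range: {min(speedups):.0f}x to {max(speedups):.0f}x")
```

The parameter reduction comes out just under 75%, and the two runtimes bracket the 250–1000× range quoted in the summary.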

📝 Abstract
Diffusion models have achieved remarkable quality in multi-modal MRI synthesis, but their computational cost (hundreds of sampling steps and separate models per modality) limits clinical deployment. We observe that this inefficiency stems from an unnecessary starting point: diffusion begins from pure noise, discarding the structural information already present in available MRI sequences. We propose WFM (Wavelet Flow Matching), which instead learns a direct flow from an informed prior, the mean of conditioning modalities in wavelet space, to the target distribution. Because the source and target share underlying anatomy and differ primarily in contrast, this formulation enables accurate synthesis in just 1-2 integration steps. A single 82M-parameter model with class conditioning synthesizes all four BraTS modalities (T1, T1c, T2, FLAIR), replacing four separate diffusion models totaling 326M parameters. On BraTS 2024, WFM achieves 26.8 dB PSNR and 0.94 SSIM, within 1-2 dB of diffusion baselines, while running 250-1000x faster (0.16-0.64s vs. 160s per volume). This speed-quality trade-off makes real-time MRI synthesis practical for clinical workflows. Code is available at https://github.com/yalcintur/WFM.
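To make the core idea concrete, here is a minimal NumPy sketch of sampling from an informed wavelet-space prior with one or two Euler steps. It is not the authors' implementation: the single-level Haar transform, the straight-line interpolation path, and every function name (`haar_3d`, `prior_from_conditions`, `euler_sample`, `oracle_velocity`) are assumptions made for illustration, and the "oracle" velocity is a stand-in for a trained network.

```python
import numpy as np

def haar_3d(vol):
    """Single-level 3D Haar transform: (D, H, W) -> (8, D/2, H/2, W/2). Dims must be even."""
    bands = [vol]
    for axis in range(3):
        split = []
        for b in bands:
            even = np.take(b, np.arange(0, b.shape[axis], 2), axis=axis)
            odd = np.take(b, np.arange(1, b.shape[axis], 2), axis=axis)
            split += [(even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)]
        bands = split
    return np.stack(bands)

def ihaar_3d(coeffs):
    """Inverse of haar_3d: (8, D/2, H/2, W/2) -> (D, H, W)."""
    bands = list(coeffs)
    for axis in (2, 1, 0):
        merged = []
        for low, high in zip(bands[0::2], bands[1::2]):
            shape = list(low.shape)
            shape[axis] *= 2
            out = np.empty(shape, dtype=low.dtype)
            even_idx = [slice(None)] * 3
            odd_idx = [slice(None)] * 3
            even_idx[axis] = slice(0, None, 2)
            odd_idx[axis] = slice(1, None, 2)
            out[tuple(even_idx)] = (low + high) / np.sqrt(2)
            out[tuple(odd_idx)] = (low - high) / np.sqrt(2)
            merged.append(out)
        bands = merged
    return bands[0]

def prior_from_conditions(cond_volumes):
    """Informed prior: mean of the available modalities in wavelet space."""
    return np.mean([haar_3d(v) for v in cond_volumes], axis=0)

def flow_matching_pair(x0, x1, t):
    """Assumed straight-line path: point x_t and the velocity a network would regress."""
    return (1 - t) * x0 + t * x1, x1 - x0

def euler_sample(velocity_fn, x0, target_class, n_steps=1):
    """Integrate dx/dt = v(x, t, class) from t=0 to t=1 with n_steps Euler steps."""
    x, dt = x0, 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity_fn(x, k * dt, target_class)
    return x

# Toy data: shared "anatomy" with different contrast scalings (purely synthetic).
rng = np.random.default_rng(0)
anatomy = rng.normal(size=(8, 8, 8))
t1, t2, flair = 1.0 * anatomy, 0.5 * anatomy, 1.5 * anatomy
t1c = 2.0 * anatomy  # toy target modality

x0 = prior_from_conditions([t1, t2, flair])
x1 = haar_3d(t1c)

def oracle_velocity(x, t, target_class):
    # Stand-in for a trained class-conditioned network; returns the ideal velocity.
    return x1 - x0

synth = ihaar_3d(euler_sample(oracle_velocity, x0, target_class=1, n_steps=1))
print(np.allclose(synth, t1c))
```

With an ideal velocity along a straight path, a single Euler step lands exactly on the target, which is the intuition behind needing only 1-2 steps when the prior already carries the anatomy. A learned velocity is only approximate, so a second step can help correct the trajectory.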
Problem

Research questions and friction points this paper is trying to address.

multi-modal MRI synthesis
diffusion models
computational efficiency
clinical deployment
medical image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wavelet Flow Matching
multi-modal MRI synthesis
fast generative modeling
flow matching
clinical deployment
Yalcin Tur
Department of Computer Science, Stanford University, Stanford, CA, USA
Mihajlo Stojkovic
Department of Computer Science, Stanford University, Stanford, CA, USA
Ulas Bagci
Northwestern University
artificial intelligence, deep learning, biomedical image analysis, medical image computing