🤖 AI Summary
Existing temporal interpolation methods for 4D medical images assume linear motion, whereas real respiratory motion is nonlinear and quasi-periodic. To address this mismatch, the paper proposes a frequency-domain-driven generative interpolation paradigm. The method introduces a Fourier motion operator that combines physiological priors with spectral information from the feature space, and it explicitly models frequency dynamics during diffusion through a basis interaction mechanism. It also integrates a variational autoencoder with frequency-domain constraints and employs optical flow contrastive learning to enhance temporal consistency. Experiments report state-of-the-art perceptual quality and temporal consistency while keeping reconstruction metrics such as PSNR and SSIM competitive. The interpolated frames show improved anatomical fidelity, temporal continuity, and motion naturalness.
📝 Abstract
Temporal interpolation for 4D medical imaging plays a crucial role in the clinical practice of respiratory motion modeling. Following a simplified linear-motion hypothesis, existing approaches adopt optical flow-based models to interpolate intermediate frames. However, realistic respiratory motion is nonlinear and quasi-periodic, with specific frequencies. Motivated by this property, we approach the temporal interpolation task from the frequency perspective and propose a Fourier basis-guided diffusion model, termed FB-Diff. Specifically, because respiration follows a regular motion pattern, we introduce physiological motion priors to describe the general characteristics of the temporal data distribution. We then devise a Fourier motion operator that extracts Fourier bases by combining the physiological motion priors with case-specific spectral information in the feature space of a variational autoencoder. Well-learned Fourier bases can better simulate respiratory motion, since they capture motion patterns at specific frequencies. Conditioned on the starting and ending frames, the diffusion model further leverages the learned Fourier bases through a basis interaction operator, which lets the temporal interpolation task be solved in a generative manner. Extensive results demonstrate that FB-Diff achieves state-of-the-art (SOTA) perceptual performance with better temporal consistency while maintaining promising reconstruction metrics. Code is available.
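The gap between the linear-motion hypothesis and quasi-periodic motion can be illustrated with a toy 1D example. The sketch below is not the FB-Diff implementation: it involves no diffusion model, no variational autoencoder, and no learned operator. It only compares linear interpolation between a start and an end frame with a least-squares fit on a small Fourier basis. The breathing frequencies, the amplitudes, and all function names are assumptions chosen for illustration, and samples from an earlier breathing cycle stand in, very loosely, for a motion prior.

```python
import numpy as np


def fourier_design(t, freqs):
    """Design matrix with a constant column plus a cos/sin pair per frequency (Hz)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    cols = [np.ones_like(t)]
    for f in freqs:
        cols.append(np.cos(2.0 * np.pi * f * t))
        cols.append(np.sin(2.0 * np.pi * f * t))
    return np.stack(cols, axis=1)


def fit_fourier(t_obs, y_obs, freqs):
    """Least-squares coefficients of the Fourier basis for the observed samples."""
    coef, *_ = np.linalg.lstsq(fourier_design(t_obs, freqs), y_obs, rcond=None)
    return coef


def eval_fourier(t, coef, freqs):
    """Evaluate the fitted Fourier model at times t."""
    return fourier_design(t, freqs) @ coef


def linear_interp(t, t0, y0, t1, y1):
    """Linear-motion hypothesis: straight line between the two end frames."""
    w = (float(t) - t0) / (t1 - t0)
    return (1.0 - w) * y0 + w * y1


def true_motion(t):
    """Hypothetical quasi-periodic displacement (mm): 4 s cycle plus one harmonic."""
    t = np.asarray(t, dtype=float)
    return 10.0 * np.sin(2.0 * np.pi * 0.25 * t) + 2.0 * np.cos(2.0 * np.pi * 0.5 * t)


# Assumed frequencies (Hz): a 4 s breathing cycle and its first harmonic.
freqs = [0.25, 0.5]

# Start and end frames of the interval to interpolate.
t0, t1 = 0.0, 2.0
y0, y1 = float(true_motion(t0)), float(true_motion(t1))

# Observations: one earlier cycle (standing in for a prior) plus the end frame.
# The query time t_mid is never observed.
t_obs = np.concatenate([np.linspace(-4.0, 0.0, 9), [t1]])
y_obs = true_motion(t_obs)

t_mid = 1.0
coef = fit_fourier(t_obs, y_obs, freqs)
y_fourier = float(eval_fourier(t_mid, coef, freqs)[0])
y_linear = linear_interp(t_mid, t0, y0, t1, y1)
y_true = float(true_motion(t_mid))

err_linear = abs(y_linear - y_true)
err_fourier = abs(y_fourier - y_true)
print(f"linear error:  {err_linear:.4f} mm")
print(f"fourier error: {err_fourier:.4f} mm")
```

In this noiseless toy setting the Fourier fit recovers the intermediate displacement almost exactly because the signal lies in the span of the chosen basis, while the straight line misses the peak of the breathing cycle. Real 4D images are high-dimensional and noisy, which is where the learned bases and the generative model described in the abstract come in.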