🤖 AI Summary
Temporal modeling in medical imaging faces two key challenges: (1) existing methods predominantly exploit only a single timepoint as context, and (2) they lack support for fine-grained 3D spatial prediction. To address these, we propose Temporal Flow Matching (TFM), a unified generative trajectory modeling framework that handles irregularly sampled, multi-prior, 4D volumetric longitudinal data. TFM aims to learn the underlying temporal distribution via time-aware encoding and conditional flow matching, and by design can fall back to predicting the last context image (LCI) as a special case. Evaluated on three public longitudinal medical imaging datasets, TFM establishes a new state of the art in 4D medical image forecasting, consistently outperforming spatio-temporal models adapted from natural imaging. It also demonstrates robustness to acquisition variability and generalizes across diseases. This work provides a reliable foundation for disease progression modeling, treatment planning, and developmental trajectory analysis.
📝 Abstract
Understanding temporal dynamics in medical imaging is crucial for applications such as disease progression modeling, treatment planning, and anatomical development tracking. However, most deep learning methods either consider only a single temporal context or focus on tasks such as classification or regression, which limits their ability to make fine-grained spatial predictions. While some approaches have been explored, they are often limited to single timepoints or specific diseases, or have other technical restrictions. To address this fundamental gap, we introduce Temporal Flow Matching (TFM), a unified generative trajectory method that (i) aims to learn the underlying temporal distribution, (ii) can by design fall back to a nearest-image predictor, i.e., predicting the last context image (LCI), as a special case, and (iii) supports 3D volumes, multiple prior scans, and irregular sampling. Extensive benchmarks on three public longitudinal datasets show that TFM consistently surpasses spatio-temporal methods from natural imaging, establishing a new state-of-the-art and robust baseline for 4D medical image prediction.
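To make the LCI fallback concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that the flow starts from the last context image and uses the linear interpolation path common in conditional flow matching; the abstract does not confirm either choice. All function names (`last_context_image`, `flow_matching_pair`, `euler_integrate`) and the toy data are hypothetical. Under these assumptions, a velocity field that is identically zero leaves the source unchanged, so the prediction is exactly the LCI.

```python
import numpy as np


def last_context_image(volumes, times):
    """LCI baseline: return the context volume with the latest acquisition time."""
    times = np.asarray(times, dtype=float)
    return volumes[int(np.argmax(times))]


def flow_matching_pair(x0, x1, t):
    """Point on the linear path from x0 to x1 and its constant target velocity.

    Assumption: linear interpolation path, a common conditional
    flow matching choice; TFM's actual path may differ.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target


def euler_integrate(x0, velocity_fn, n_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x


rng = np.random.default_rng(0)

# Hypothetical toy data: three 3D context scans at irregular times (months)
# and one future scan to be predicted.
context = [rng.normal(size=(4, 4, 4)) for _ in range(3)]
times = [0.0, 5.5, 7.0]
target = rng.normal(size=(4, 4, 4))

lci = last_context_image(context, times)

# Training-time quantities: a regression target for a velocity network.
x_t, v_target = flow_matching_pair(lci, target, t=0.25)

# A zero velocity field reproduces the LCI exactly (the fallback case).
pred_zero = euler_integrate(lci, lambda x, t: np.zeros_like(x))

# An oracle field equal to the true displacement reaches the target.
pred_oracle = euler_integrate(lci, lambda x, t: target - lci)
```

In a real model, the two lambda fields would be replaced by a learned network conditioned on the prior scans and their acquisition times; the sketch only shows why starting the flow at the LCI makes that baseline recoverable as a special case.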