🤖 AI Summary
4D medical image interpolation generalizes poorly under distribution shift across domains, and test inputs arrive without ground-truth labels that could guide adaptation. Method: We propose a novel unsupervised test-time training (TTT) framework designed for this task. Before interpolating a test video, the approach adapts the frame interpolation network on that single video using self-supervised signals, namely rotation prediction and image reconstruction, enabling label-free, online domain adaptation. Contribution/Results: The paradigm is extensible to downstream tasks such as segmentation and registration. Evaluated on the Cardiac and 4D-Lung datasets, our method achieves PSNRs of 33.73 dB and 34.02 dB, respectively, higher than the compared methods. These results indicate robustness to distribution shift and relevance to clinical use.
📝 Abstract
4D medical image interpolation is essential for improving temporal resolution and diagnostic precision in clinical applications. Previous works ignore the problem of distribution shift, resulting in poor generalization when the test distribution differs from the training distribution. A natural solution would be to adapt the model to the new test distribution, but this cannot be done when the test input comes without a ground-truth label. In this paper, we propose a novel test-time training framework that uses self-supervision to adapt the model to a new distribution without requiring any labels. Specifically, before performing frame interpolation on each test video, the model is trained on that same video using a self-supervised task, such as rotation prediction or image reconstruction. We conduct experiments on two publicly available 4D medical image interpolation datasets, Cardiac and 4D-Lung. The experimental results show that the proposed method performs strongly across various evaluation metrics on both datasets, achieving higher peak signal-to-noise ratio (PSNR) values of 33.73 dB on Cardiac and 34.02 dB on 4D-Lung. Our method not only advances 4D medical image interpolation but also provides a template for domain adaptation in other fields such as image segmentation and image registration.
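For readers unfamiliar with the reported metric, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE). The function name `psnr`, the flat-list image representation, and the default peak value of 1.0 are assumptions made for illustration; the abstract does not specify the intensity range or the exact evaluation code used.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a flat list of pixel intensities in [0, max_value]."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01, i.e. about 20 dB.
example_db = psnr([0.0, 0.0], [0.1, 0.1])
```

Higher values mean the interpolated frame is closer to the reference frame. Because the scale is logarithmic, a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error.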