🤖 AI Summary
Deep learning-based medical image registration is hard to deploy clinically because it relies heavily on large-scale training data. To address this, we propose a training-free test-time optimization paradigm. Its key innovation is the first integration of a frozen DINOv3 self-supervised vision encoder into medical registration, which lets the deformation field be optimized iteratively in a compact semantic feature space. Because no supervised training is required, the method can be applied to new clinical data as is. On abdominal MR–CT and cardiac MRI (ACDC) datasets, it achieves mean Dice scores of 0.790 and 0.769, respectively, with 95th-percentile Hausdorff distances (HD95) of 4.9 and 4.8 and log-Jacobian standard deviations (SDLogJ) of 0.08 and 0.11. These results indicate accurate registration with regular, anatomically plausible deformations across modalities.
📝 Abstract
Prior medical image registration approaches, particularly learning-based methods, often require large amounts of training data, which constrains clinical adoption. To overcome this limitation, we propose a training-free pipeline that relies on a frozen DINOv3 encoder and test-time optimization of the deformation field in feature space. Across two representative benchmarks, the method is accurate and yields regular deformations. On Abdomen MR-CT, it attains the best mean Dice score (DSC) of 0.790, together with the lowest 95th-percentile Hausdorff distance (HD95) of 4.9 ± 5.0 and the lowest standard deviation of the log-Jacobian (SDLogJ) of 0.08 ± 0.02. On ACDC cardiac MRI, it improves mean DSC to 0.769 and reduces SDLogJ to 0.11 and HD95 to 4.8, a marked gain over the initial alignment. The results indicate that operating in a compact foundation feature space at test time offers a practical and general solution for clinical registration without additional training.
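For readers unfamiliar with the reported metrics, the following is a minimal sketch of how DSC and SDLogJ are commonly computed. It is an illustration and not the authors' evaluation code: the function names `dice` and `sd_log_jacobian`, the 2D grid, the forward-difference scheme, and the clamping of non-positive determinants are all assumptions made here for brevity.

```python
import math
import statistics


def dice(seg_a, seg_b, label):
    """Dice overlap for one label between two equally sized flat label lists."""
    a = [v == label for v in seg_a]
    b = [v == label for v in seg_b]
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Convention assumed here: a label absent from both masks scores 1.0.
    return 2.0 * inter / total if total else 1.0


def sd_log_jacobian(ux, uy):
    """Standard deviation of log det(J) for phi(x) = x + u(x) on a 2D grid.

    ux[i][j] and uy[i][j] hold the displacement (in pixels) along x (column j)
    and y (row i). Derivatives use forward differences; non-positive
    determinants (folding) are clamped to a small epsilon before the log.
    """
    eps = 1e-9
    rows, cols = len(ux), len(ux[0])
    logs = []
    for i in range(rows - 1):
        for j in range(cols - 1):
            dux_dx = ux[i][j + 1] - ux[i][j]
            dux_dy = ux[i + 1][j] - ux[i][j]
            duy_dx = uy[i][j + 1] - uy[i][j]
            duy_dy = uy[i + 1][j] - uy[i][j]
            det = (1.0 + dux_dx) * (1.0 + duy_dy) - dux_dy * duy_dx
            logs.append(math.log(max(det, eps)))
    return statistics.pstdev(logs)
```

Under these assumptions, an identity or uniformly scaled deformation gives an SDLogJ of approximately zero, while spatially varying stretching raises it, which is why lower values are read as more regular deformations.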