🤖 AI Summary
This work addresses the challenge of confounding head motion with swallowing dynamics in videofluoroscopic swallowing studies, which compromises quantitative analysis accuracy. We propose a markerless tracking–driven framework for motion separation and image registration. For the first time, sparse markerless tracking models—CoTracker, PIPs++, and TAP-Net—are integrated into swallowing image correction. The framework combines robust optical flow analysis, multi-model tracking point evaluation, and ROI-guided deformation field modeling to generate high-fidelity, physiologically consistent deformation fields. Evaluated on real clinical X-ray data, our method achieves a 37% improvement in head-motion suppression over state-of-the-art methods (ANTs, LDDMM, VoxelMorph), a 22% gain in dynamic structural fidelity (measured by SSIM), and a 19% reduction in deformation field error (MSE). These advances significantly enhance the reliability and interpretability of quantitative swallowing function analysis.
📝 Abstract
Our study focuses on isolating swallowing dynamics from interfering patient motion in videofluoroscopy, an X-ray technique that records patients swallowing a radiopaque bolus. These recordings capture multiple motion sources, including head movement, anatomical displacements, and bolus transit. To enable precise analysis of swallowing physiology, we aim to eliminate distracting motion, particularly head movement, while preserving essential swallowing-related dynamics. Optical flow methods fail due to artifacts like flickering and instability, making them unreliable for distinguishing different motion groups. We evaluated markerless tracking approaches (CoTracker, PIPs++, TAP-Net) and quantified tracking accuracy in key medical regions of interest. Our findings show that even sparse tracking points generate morphing displacement fields that outperform leading registration methods such as ANTs, LDDMM, and VoxelMorph. To compare all approaches, we assessed performance using MSE and SSIM metrics post-registration. We introduce a novel motion correction pipeline that effectively removes disruptive motion while preserving swallowing dynamics and surpassing competitive registration techniques. Code will be available after review.