🤖 AI Summary
This work addresses the challenge of myocardial segmentation in echocardiographic videos, which is particularly hindered by low contrast, noise, anatomical variability, and poor temporal consistency, especially in low-quality data. To this end, the authors propose Point-Seg, a novel framework that, for the first time, explicitly incorporates point-tracking trajectories as motion-aware signals into the segmentation pipeline, providing pixel-level myocardial motion cues without relying on feature memory propagation. By integrating a Transformer-based segmentation architecture, a point-tracking module trained on synthetic data, and a temporal smoothing loss, Point-Seg achieves state-of-the-art performance on both public and private datasets. It not only matches or exceeds the accuracy of current methods on high-quality videos but also substantially improves segmentation fidelity and temporal stability in low-quality sequences, while effectively supporting downstream tasks such as myocardial strain analysis.
📝 Abstract
Purpose: Myocardium segmentation in echocardiography videos is a challenging task due to low contrast, noise, and anatomical variability. Traditional deep learning models either process frames independently, ignoring temporal information, or rely on memory-based feature propagation, which accumulates error over time. Methods: We propose Point-Seg, a transformer-based segmentation framework that integrates point tracking as a temporal cue to ensure stable and consistent segmentation of the myocardium across frames. Our method leverages a point-tracking module trained on a synthetic echocardiography dataset to track key anatomical landmarks across video sequences. These tracked trajectories provide an explicit motion-aware signal that guides segmentation, reducing drift and eliminating the need for memory-based feature accumulation. Additionally, we incorporate a temporal smoothing loss to further enhance temporal consistency across frames. Results: We evaluate our approach on both public and private echocardiography datasets. Experimental results demonstrate that Point-Seg achieves Dice accuracy statistically similar to state-of-the-art segmentation models on high-quality echo data, while achieving better segmentation accuracy with improved temporal stability on lower-quality echo. Furthermore, unlike other segmentation methods, Point-Seg provides pixel-level myocardial motion information. Such information is essential for downstream tasks such as myocardial strain measurement and regional wall motion abnormality detection. Conclusion: Point-Seg demonstrates that point tracking can serve as an effective temporal cue for consistent video segmentation, offering a reliable and generalizable approach for myocardium segmentation in echocardiography videos. The code is available at https://github.com/DeepRCL/PointSeg.
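To make the temporal smoothing idea from the abstract concrete, here is a minimal, hypothetical sketch of one common form of such a loss: penalizing frame-to-frame changes in the predicted soft masks. This is an illustration of the general technique, not the authors' actual implementation (see the linked repository for that); the function name and array shapes are assumptions.

```python
import numpy as np

def temporal_smoothing_loss(masks: np.ndarray) -> float:
    """Illustrative temporal smoothing loss (not the official Point-Seg loss).

    masks: array of shape (T, H, W) holding per-frame soft segmentation
           predictions in [0, 1] for T consecutive video frames.
    Returns the mean squared difference between consecutive frames, which
    is small when the predicted masks evolve smoothly over time.
    """
    diffs = masks[1:] - masks[:-1]        # (T-1, H, W) consecutive-frame deltas
    return float(np.mean(diffs ** 2))     # 0.0 for a perfectly static mask
```

In practice a term like this would be added, with a weighting coefficient, to the primary segmentation loss (e.g. Dice or cross-entropy), trading a small amount of per-frame flexibility for reduced flicker across the sequence.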