🤖 AI Summary
This work addresses long-term failures in planar object tracking caused by drastic appearance changes or occlusions. We combine SAM 2 segmentation tracking with 8-degrees-of-freedom homography estimation in two trackers. SAM-H estimates the homography directly from the contours of segmentation masks, which makes it robust to changes in target appearance. WOFTSAM adds SAM-H as a re-detection mechanism for lost targets to the WOFT planar tracker. Both set a new state of the art on the POT-210 and PlanarTrack benchmarks; on PlanarTrack, they outperform the second-best method by 12.4 and 15.2 percentage points on the p@15 metric. Additionally, we provide more accurate ground-truth annotations of the initial poses in the PlanarTrack dataset.
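To make the p@15 figure concrete, the following is a minimal sketch of how a precision-at-threshold score could be computed. It assumes the definition commonly used with planar tracking benchmarks: the alignment error of a frame is the root-mean-square distance between the predicted and ground-truth corners, and p@t is the fraction of frames whose error does not exceed t pixels. The function names, the toy data, and the non-strict threshold are illustrative assumptions, not taken from the paper or its code.

```python
import math


def alignment_error(pred_corners, gt_corners):
    """RMS distance in pixels between corresponding corner points (assumed definition)."""
    squared = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(pred_corners, gt_corners)
    ]
    return math.sqrt(sum(squared) / len(squared))


def precision_at(errors, threshold):
    """Fraction of frames whose alignment error is at most `threshold` pixels."""
    return sum(e <= threshold for e in errors) / len(errors)


def shifted(corners, dx, dy):
    """Helper for the toy data: translate every corner by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in corners]


# Toy sequence of three frames: the ground truth is a fixed square,
# and each prediction is the same square translated by a different offset.
gt = [(0, 0), (100, 0), (100, 100), (0, 100)]
predictions = [shifted(gt, 0, 3), shifted(gt, 6, 8), shifted(gt, 30, 40)]

errors = [alignment_error(p, gt) for p in predictions]  # 3, 10 and 50 pixels
p5 = precision_at(errors, 5)    # one of three frames is within 5 px
p15 = precision_at(errors, 15)  # two of three frames are within 15 px
```

Under this reading, a gain of 15.2 percentage points on p@15 means that 15.2% more of the evaluated frames fall within 15 pixels of the ground truth.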
📝 Abstract
We present SAM-H and WOFTSAM, novel planar trackers that combine the robust long-term segmentation tracking provided by SAM 2 with 8-degrees-of-freedom homography pose estimation. SAM-H estimates homographies from segmentation mask contours and is thus highly robust to target appearance changes. WOFTSAM significantly improves the current state-of-the-art planar tracker WOFT by exploiting the lost-target re-detection provided by SAM-H. The proposed methods are evaluated on the POT-210 and PlanarTrack tracking benchmarks, setting a new state of the art on both. On the latter, they outperform the second-best method by a large margin, +12.4 and +15.2 pp on the p@15 metric. We also present improved ground-truth annotations of the initial PlanarTrack poses, enabling more accurate benchmarking in the high-precision p@5 metric. The code and the re-annotations are available at https://github.com/serycjon/WOFTSAM
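As an illustration of what an 8-degrees-of-freedom homography is, the sketch below recovers one from four point correspondences by solving the standard 8x8 linear system. It is not the SAM-H algorithm: how SAM-H turns a mask contour into a pose is not described here, so the sketch simply assumes that four corner points have already been extracted from the contour. All function names and the sample coordinates are hypothetical, and fixing the bottom-right matrix entry to 1 assumes the template origin does not map to infinity.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def homography_from_corners(src, dst):
    """Return the 3x3 homography mapping four src points to four dst points.

    The matrix has 8 free parameters because it is defined up to scale;
    here the scale is fixed by setting the bottom-right entry to 1.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Project a 2D point through the homography (with perspective division)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical data: template corners and the corners of the target
# quadrilateral, assumed to come from a segmentation mask contour.
template_corners = [(0, 0), (100, 0), (100, 100), (0, 100)]
mask_corners = [(10, 20), (120, 15), (130, 140), (5, 110)]

H = homography_from_corners(template_corners, mask_corners)
projected = [apply_homography(H, p) for p in template_corners]
```

Because the pose in this sketch depends only on the target's outline, it illustrates why a contour-based estimate can remain usable when the target's texture changes, which is the robustness property the abstract attributes to SAM-H.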