🤖 AI Summary
Multi-object tracking (MOT) suffers from heavy reliance on detection accuracy and poor generalization across datasets. To address this, we propose a novel “Tracking by Segmentation” paradigm that bypasses conventional detection-driven pipelines and directly generates tracking bounding boxes from segmentation masks. Our method builds upon the SAM2 architecture and incorporates a trajectory manager, a cross-object interaction module, and a mask-to-bounding-box mapping mechanism to enable end-to-end, zero-shot cross-dataset tracking. The core innovation lies in segmentation-driven tracking modeling, which significantly improves robustness to occlusion and enhances modeling of object lifecycles. Extensive experiments demonstrate state-of-the-art performance on DanceTrack, UAVDT, and BDD100K: on DanceTrack, our approach achieves +2.1 HOTA and +4.5 IDF1 over prior art.
📝 Abstract
Segment Anything Model 2 (SAM2) enables robust single-object tracking using segmentation. To extend this to multi-object tracking (MOT), we propose SAM2MOT, introducing a novel Tracking by Segmentation paradigm. Unlike Tracking by Detection or Tracking by Query, SAM2MOT directly generates tracking boxes from segmentation masks, reducing reliance on detection accuracy. SAM2MOT has two key advantages: zero-shot generalization, allowing it to work across datasets without fine-tuning, and strong object association, inherited from SAM2. To further improve performance, we integrate a trajectory manager system for precise object addition and removal, and a cross-object interaction module to handle occlusions. Experiments on DanceTrack, UAVDT, and BDD100K show state-of-the-art results. Notably, SAM2MOT outperforms existing methods on DanceTrack by +2.1 HOTA and +4.5 IDF1, highlighting its effectiveness in MOT.
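The mask-to-bounding-box mapping at the heart of Tracking by Segmentation is, in its simplest form, taking the tight axis-aligned box around a binary mask. The paper does not publish its exact mapping code here, so the sketch below is a minimal, assumed illustration of the idea (the function name `mask_to_bbox` is ours):

```python
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Convert a binary segmentation mask of shape (H, W) into a tight
    axis-aligned bounding box (x_min, y_min, x_max, y_max).
    Returns None for an empty mask (e.g. a fully occluded object)."""
    ys, xs = np.nonzero(mask)          # pixel coordinates of the object
    if ys.size == 0:
        return None                    # no pixels: no box to emit
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy example: a filled rectangle covering rows 1-3 and columns 2-4.
mask = np.zeros((6, 6), dtype=bool)
mask[1:4, 2:5] = True
print(mask_to_bbox(mask))  # (2, 1, 4, 3)
```

A per-frame tracker output would apply this to each object's mask from SAM2; the empty-mask case is where a trajectory manager, as described above, would decide whether to suspend or remove the track.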