🤖 AI Summary
To address association errors in multi-object tracking (MOT) caused by occlusion and appearance similarity, this paper proposes a training-free, depth-aware tracking framework. Methodologically, it introduces zero-shot monocular depth estimation as an independent geometric cue in the data association stage, yielding 3D spatial priors without supervision. A parameter-free Hierarchical Alignment Score is designed, combining coarse-grained IoU-based spatial matching with fine-grained pixel-level alignment to enable synergistic geometric-appearance modeling. Unsupervised motion modeling is also integrated to improve robustness in dynamic scenes. The framework achieves state-of-the-art performance on challenging benchmarks including MOT17 and MOT20, with no training or fine-tuning required at any stage. All code is publicly released.
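The core idea of using depth as an independent cue in data association can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `w_depth` weight, the relative-depth cost, and the per-object scalar depths are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(a, b):
    # Boxes given as (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(tracks, dets, track_depths, det_depths, w_depth=0.5):
    """Combine an IoU cost with an independent depth-consistency cost,
    then solve the assignment with the Hungarian algorithm.

    `track_depths` / `det_depths` are hypothetical per-object scalar
    depths (e.g. the median of a zero-shot monocular depth map inside
    each box)."""
    cost = np.zeros((len(tracks), len(dets)))
    for i, (t, td) in enumerate(zip(tracks, track_depths)):
        for j, (d, dd) in enumerate(zip(dets, det_depths)):
            iou_cost = 1.0 - iou(t, d)
            # Relative depth gap: two boxes that overlap in 2D but sit
            # at very different depths are penalized.
            depth_cost = abs(td - dd) / (max(td, dd) + 1e-9)
            cost[i, j] = iou_cost + w_depth * depth_cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols)), cost
```

The depth term acts as a separate decision signal rather than being fused into appearance features, which is what lets it disambiguate occluding objects that overlap heavily in the image plane.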
📝 Abstract
Current motion-based multiple object tracking (MOT) approaches rely heavily on Intersection-over-Union (IoU) for object association. Without using 3D features, they are ineffective in scenarios with occlusions or visually similar objects. To address this, our paper presents a novel depth-aware framework for MOT. We estimate depth using a zero-shot approach and incorporate it as an independent feature in the association process. Additionally, we introduce a Hierarchical Alignment Score that refines IoU by integrating both coarse bounding box overlap and fine-grained (pixel-level) alignment to improve association accuracy without requiring additional learnable parameters. To our knowledge, this is the first MOT framework to incorporate 3D features (monocular depth) as an independent decision matrix in the association step. Our framework achieves state-of-the-art results on challenging benchmarks without any training or fine-tuning. The code is available at https://github.com/Milad-Khanchi/DepthMOT
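A coarse-to-fine score of this kind could be instantiated as below. This is only a sketch under stated assumptions: the abstract does not give the exact formula, so the equal-weight average and the zero-mean patch correlation used for the fine term are hypothetical choices, chosen to stay parameter-free.

```python
import numpy as np


def box_iou(a, b):
    # Coarse term: plain IoU on (x1, y1, x2, y2) boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def pixel_alignment(p, q):
    """Fine term: normalized correlation between two equally sized
    patches (e.g. depth or intensity crops resized to a common shape).
    Returns a value in [-1, 1]; no learnable parameters involved."""
    p = (p - p.mean()) / (p.std() + 1e-9)
    q = (q - q.mean()) / (q.std() + 1e-9)
    return float(np.clip((p * q).mean(), -1.0, 1.0))


def hierarchical_alignment_score(box_a, box_b, patch_a, patch_b):
    """Hypothetical coarse-to-fine combination: the IoU gives box-level
    overlap, the pixel term refines it; both are averaged equally."""
    coarse = box_iou(box_a, box_b)
    fine = 0.5 * (pixel_alignment(patch_a, patch_b) + 1.0)  # map to [0, 1]
    return 0.5 * (coarse + fine)
```

Because both terms are fixed functions of the inputs, the score stays training-free, matching the abstract's claim of requiring no additional learnable parameters.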