Depth-Aware Scoring and Hierarchical Alignment for Multiple Object Tracking

📅 2025-06-01
🤖 AI Summary
To address association errors in multi-object tracking (MOT) caused by occlusion and appearance similarity, this paper proposes a training-free, depth-aware tracking framework. It introduces zero-shot monocular depth estimation as an independent geometric cue in the data association stage, yielding 3D spatial priors without supervision. A parameter-free hierarchical alignment scoring mechanism combines coarse-grained IoU-based spatial matching with fine-grained pixel-level alignment. Additionally, unsupervised motion modeling is integrated to further improve robustness in dynamic scenes. The framework achieves state-of-the-art performance on challenging benchmarks including MOT17 and MOT20, with no training or fine-tuning required at any stage. All code is publicly released.

📝 Abstract
Current motion-based multiple object tracking (MOT) approaches rely heavily on Intersection-over-Union (IoU) for object association. Because they do not use 3D features, they are ineffective in scenarios with occlusions or visually similar objects. To address this, we present a novel depth-aware framework for MOT. We estimate depth using a zero-shot approach and incorporate it as an independent feature in the association process. Additionally, we introduce a Hierarchical Alignment Score that refines IoU by integrating both coarse bounding box overlap and fine-grained (pixel-level) alignment to improve association accuracy without requiring additional learnable parameters. To our knowledge, this is the first MOT framework to incorporate 3D features (monocular depth) as an independent decision matrix in the association step. Our framework achieves state-of-the-art results on challenging benchmarks without any training or fine-tuning. The code is available at https://github.com/Milad-Khanchi/DepthMOT
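
The idea of using depth as a separate cue during association can be illustrated with a toy example. The sketch below is not the authors' implementation: the `depth_similarity` function, the weighted sum controlled by `w_depth`, the normalized depth values, and the brute-force matcher are all assumptions made for illustration. It only shows how a depth term can resolve a match that IoU alone gets wrong when two boxes overlap heavily.

```python
from itertools import permutations

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def depth_similarity(d_track, d_det):
    """Hypothetical similarity in [0, 1] for depths normalized to [0, 1]."""
    return 1.0 - abs(d_track - d_det)

def associate(tracks, dets, w_depth=0.5):
    """Return, for each track, the index of its matched detection.

    Maximizes the summed score by brute force over permutations, which is
    only practical for tiny examples. The weighted sum of IoU and depth
    similarity is an assumption, not the paper's formulation.
    """
    assert len(tracks) <= len(dets), "sketch assumes one detection per track"

    def score(t, d):
        iou = box_iou(t["box"], d["box"])
        dep = depth_similarity(t["depth"], d["depth"])
        return (1.0 - w_depth) * iou + w_depth * dep

    best, best_total = None, float("-inf")
    for perm in permutations(range(len(dets)), len(tracks)):
        total = sum(score(t, dets[j]) for t, j in zip(tracks, perm))
        if total > best_total:
            best, best_total = list(perm), total
    return best

# Two heavily overlapping people: one near the camera, one far away.
tracks = [
    {"box": (100, 100, 200, 300), "depth": 0.2},
    {"box": (110, 100, 210, 300), "depth": 0.8},
]
dets = [
    {"box": (108, 100, 208, 300), "depth": 0.25},
    {"box": (102, 100, 202, 300), "depth": 0.75},
]

iou_only = associate(tracks, dets, w_depth=0.0)    # swaps the two identities
with_depth = associate(tracks, dets, w_depth=0.5)  # keeps each identity at its depth
```

In a real tracker the brute-force search would normally be replaced by a linear assignment solver such as `scipy.optimize.linear_sum_assignment`, and the depth values would come from a monocular depth estimator rather than hand-written numbers.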

Problem

Research questions and friction points this paper is trying to address.

Improves object association in MOT using depth-aware features
Addresses occlusion issues with zero-shot depth estimation
Enhances accuracy via Hierarchical Alignment Score without training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot depth estimation for 3D features
Hierarchical Alignment Score combining coarse-fine alignment
Depth-aware association without learnable parameters
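
The coarse-plus-fine idea behind a hierarchical alignment score can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formula: the helper names (`mask_to_box`, `mask_iou`, `alignment_score`), the binary masks, and the choice to average box IoU with mask IoU are all hypothetical. The point is that two objects can share the same bounding box while their pixels barely align, and a parameter-free combination of both levels exposes that.

```python
def box_iou(a, b):
    """Coarse level: IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mask_to_box(mask):
    """Tight bounding box (x1, y1, x2, y2) around the nonzero pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)

def mask_iou(m1, m2):
    """Fine level: pixel-wise IoU of two equally sized binary masks."""
    inter = sum(a & b for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))
    union = sum(a | b for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))
    return inter / union if union > 0 else 0.0

def alignment_score(m1, m2):
    """Assumed combination: mean of coarse and fine IoU, with no weights to tune.

    Returns 0 when the boxes do not overlap at all, so the fine level is
    only consulted for candidates that pass the coarse check.
    """
    coarse = box_iou(mask_to_box(m1), mask_to_box(m2))
    if coarse == 0.0:
        return 0.0
    return 0.5 * (coarse + mask_iou(m1, m2))

# Two shapes with identical bounding boxes but mostly different pixels.
mask_a = [[1, 1, 1],
          [1, 0, 0],
          [1, 0, 0]]
mask_b = [[0, 0, 1],
          [0, 0, 1],
          [1, 1, 1]]

coarse_only = box_iou(mask_to_box(mask_a), mask_to_box(mask_b))
combined = alignment_score(mask_a, mask_b)
```

Here the box-level IoU treats the two shapes as a perfect match, while the combined score is noticeably lower because only two of the eight occupied pixels coincide.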
Milad Khanchi
Concordia University, Montreal, Quebec, Canada
Maria Amer
Concordia University, Montreal, Quebec, Canada
Charalambos Poullis
Immersive and Creative Technologies Lab, Department of Computer Science, Concordia University
Computer Vision/Graphics, VR|AR|MR