Head Anchor Enhanced Detection and Association for Crowded Pedestrian Tracking

📅 2025-08-07
🤖 AI Summary
To address trajectory fragmentation and feature loss in pedestrian tracking under dense, heavily occluded scenarios, this paper proposes a method that combines head keypoint anchoring with motion modeling enhanced by 3D structural priors. The approach addresses the limitations of bounding-box-based tracking in three ways: (1) it uses head keypoints as occlusion-resilient local anchors in place of failure-prone full-body bounding boxes; (2) it designs an iterative Kalman filter, tailored to modern detector output distributions, that explicitly incorporates human 3D skeletal priors to improve motion prediction accuracy; and (3) it fuses complementary multi-source features: detection classification and regression logits, Re-ID appearance embeddings, and head keypoint representations. Evaluated on multiple crowded benchmarks, the method improves IDF1 and MOTA, indicating better trajectory stability and association accuracy than conventional constant-velocity models and full-body bounding-box baselines.

📝 Abstract
Visual pedestrian tracking represents a promising research field, with extensive applications in intelligent surveillance, behavior analysis, and human-computer interaction. However, real-world applications face significant occlusion challenges. When multiple pedestrians interact or overlap, the loss of target features severely compromises the tracker's ability to maintain stable trajectories. Traditional tracking methods, which typically rely on full-body bounding box features extracted from Re-ID models and linear constant-velocity motion assumptions, often struggle in severe occlusion scenarios. To address these limitations, this work proposes an enhanced tracking framework that leverages richer feature representations and a more robust motion model. Specifically, the proposed method incorporates detection features from both the regression and classification branches of an object detector, embedding spatial and positional information directly into the feature representations. To further mitigate occlusion challenges, a head keypoint detection model is introduced, as the head is less prone to occlusion compared to the full body. In terms of motion modeling, we propose an iterative Kalman filtering approach designed to align with modern detector assumptions, integrating 3D priors to better complete motion trajectories in complex scenes. By combining these advancements in appearance and motion modeling, the proposed method offers a more robust solution for multi-object tracking in crowded environments where occlusions are prevalent.
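
The linear constant-velocity assumption that the abstract attributes to traditional trackers can be illustrated with a minimal sketch. This is the conventional baseline only, not the paper's iterative filter with 3D priors, whose details are not given on this page. The class name `ConstantVelocityKF1D`, the noise values, and the choice to track a single head coordinate are all assumptions made for illustration.

```python
class ConstantVelocityKF1D:
    """Baseline Kalman filter for one coordinate with state [position, velocity].

    Hypothetical illustration: names and noise values are assumptions,
    not taken from the paper.
    """

    def __init__(self, pos, vel=0.0, p_init=10.0, q=0.01, r=1.0):
        self.pos = pos
        self.vel = vel
        # 2x2 state covariance stored as four scalars.
        self.p00, self.p01, self.p10, self.p11 = p_init, 0.0, 0.0, p_init
        self.q = q  # process noise added to each diagonal entry
        self.r = r  # measurement noise variance

    def predict(self, dt=1.0):
        """Propagate the state one step: position advances by vel * dt."""
        self.pos += self.vel * dt
        a, b, c, d = self.p00, self.p01, self.p10, self.p11
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        self.p00 = a + dt * (b + c) + dt * dt * d + self.q
        self.p01 = b + dt * d
        self.p10 = c + dt * d
        self.p11 = d + self.q
        return self.pos

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        s = self.p00 + self.r            # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        y = z - self.pos                 # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        p00, p01 = self.p00, self.p01
        # P <- (I - K H) P
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p10 = self.p10 - k1 * p00
        self.p11 = self.p11 - k1 * p01


def track_then_coast(measurements, coast_steps):
    """Filter the visible measurements, then predict through an occlusion."""
    kf = ConstantVelocityKF1D(pos=0.0)
    for z in measurements:
        kf.predict()
        kf.update(z)
    for _ in range(coast_steps):
        kf.predict()  # no detection available: rely on the motion model
    return kf
```

During an occlusion this baseline can only extrapolate along a straight line, which is the weakness the abstract says the proposed motion model is meant to address.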
Problem

Research questions and friction points this paper is trying to address.

Improving crowded pedestrian tracking under occlusion
Enhancing feature representation with head keypoints
Robust motion modeling using iterative Kalman filtering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses head keypoint detection to reduce occlusion impact
Integrates regression and classification features for richer representation
Employs iterative Kalman filtering with 3D priors
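
One way to picture how the three feature sources listed above could feed a single association step is the sketch below. It is an assumption-laden illustration, not the paper's formulation: the weights, the `head_scale` normalizer, the dictionary keys, and the use of greedy matching (swapped in here for whatever assignment method the authors use) are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def head_distance(p, q, scale):
    """Euclidean distance between head points, normalized and clipped to [0, 1]."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) / scale, 1.0)


def fused_cost(track, det, w_reid=0.4, w_det=0.2, w_head=0.4, head_scale=50.0):
    """Weighted sum of Re-ID, detector-feature, and head-keypoint distances.

    The weights and head_scale are illustrative assumptions.
    """
    return (
        w_reid * cosine_distance(track["reid"], det["reid"])
        + w_det * cosine_distance(track["det_feat"], det["det_feat"])
        + w_head * head_distance(track["head"], det["head"], head_scale)
    )


def greedy_match(tracks, dets, max_cost=0.5):
    """Pair tracks and detections in order of increasing fused cost."""
    pairs = sorted(
        (fused_cost(t, d), i, j)
        for i, t in enumerate(tracks)
        for j, d in enumerate(dets)
    )
    used_tracks, used_dets, matches = set(), set(), []
    for cost, i, j in pairs:
        if cost > max_cost:
            break  # all remaining pairs are at least this costly
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return sorted(matches)
```

Because the head term stays informative when the body is partly hidden, a pair can still fall under `max_cost` even if the full-body appearance term degrades; that is the intuition behind using the head as an anchor, though the actual fusion in the paper may differ.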
Zewei Wu
Macao Polytechnic University, Macao SAR, China
César Teixeira
University of Coimbra, Coimbra 3004-531, Portugal
Wei Ke
Xi'an Jiaotong University
Computer Vision and Deep Learning
Zhang Xiong
Beihang University, Beijing 100191, China