🤖 AI Summary
This paper examines a systematic mismatch between the standard point-object (SPO) model and pedestrian detection outputs (2D bounding boxes) in model-based multi-object visual tracking. The bounding box center is modeled as a point target with dynamics rooted in continuous time, and a Poisson multi-Bernoulli mixture (PMBM) filter computes the posterior density. The framework covers the selection of target birth and survival probabilities, motion parameters, and observation uncertainty. Parameters are either selected from first principles or identified from the MOT-17 dataset, and the resulting tracker yields promising results. Crucially, the work reveals a mismatch between the SPO model and bounding-box data and points to the model components responsible for it. By explicitly linking modeling assumptions to empirical detection characteristics, the analysis motivates future model-based trackers whose models are better aligned with visual detection outputs.
📝 Abstract
This paper applies multi-object tracking methods from the radar tracking community to pedestrian tracking with 2D bounding box detections. The standard point-object (SPO) model is adopted, and the posterior density is computed using the Poisson multi-Bernoulli mixture (PMBM) filter. The selection of the model parameters, which are rooted in continuous time, is discussed, including the birth and survival probabilities. Some parameters are selected from first principles, while others are identified from data, in this case the publicly available MOT-17 dataset. Although the resulting PMBM algorithm yields promising results, a mismatch between the SPO model and the data is revealed. Within the model-based approach, modifying the problematic components that cause this mismatch is expected to lead to better model-based algorithms in future developments.
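To illustrate what "rooted in continuous time" can mean for the per-frame parameters, the following minimal sketch derives discrete-time quantities from continuous-time rates. It is not the paper's parameterization: the exponential lifetime, the Poisson birth rate, the nearly-constant-velocity motion model, and all function and argument names are assumptions made for illustration only.

```python
import math


def survival_probability(mean_lifetime_s: float, dt_s: float) -> float:
    """Per-frame survival probability, assuming an exponentially
    distributed object lifetime (constant exit rate). Hypothetical helper."""
    return math.exp(-dt_s / mean_lifetime_s)


def expected_births(birth_rate_per_s: float, dt_s: float) -> float:
    """Expected number of newly appearing objects per frame, assuming
    a homogeneous Poisson birth process. Hypothetical helper."""
    return birth_rate_per_s * dt_s


def cv_model(q: float, dt_s: float):
    """Transition matrix F and process noise covariance Q of a 1D
    nearly-constant-velocity model (state: position, velocity), obtained
    by discretizing a continuous-time white-noise acceleration model
    with intensity q. Assumed motion model, not taken from the paper."""
    F = [[1.0, dt_s],
         [0.0, 1.0]]
    Q = [[q * dt_s**3 / 3.0, q * dt_s**2 / 2.0],
         [q * dt_s**2 / 2.0, q * dt_s]]
    return F, Q


if __name__ == "__main__":
    dt = 1.0 / 30.0  # assumed frame period in seconds
    print("P_S per frame:", survival_probability(10.0, dt))
    print("expected births per frame:", expected_births(0.5, dt))
    print("F, Q:", cv_model(2.0, dt))
```

The benefit of this kind of parameterization is consistency across frame periods: under the assumed exponential lifetime, surviving two frames of length T has the same probability as surviving one frame of length 2T, so the same continuous-time rates can be reused when the frame rate changes.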