🤖 AI Summary
Tracking fast-moving objects (FMOs) remains a challenging problem in computer vision. This paper introduces FMOX—the first standardized JSON metadata framework specifically designed for FMO tracking—unifying four open-source datasets and enriching annotations with fine-grained attributes such as object size, thereby enabling consistent cross-dataset evaluation. Leveraging FMOX, we systematically benchmark EfficientTAM, an efficient foundation model, and demonstrate its competitive performance against task-specific traditional pipelines under the trajectory intersection over union (TIoU) metric, validating the promise of foundation models for FMO tracking. Key contributions include: (1) the first structured metadata specification for FMO image sequences; (2) the FMOX annotation schema, enhanced with object-scale information; and (3) full open-sourcing of code and annotations, significantly improving reproducibility and fair model comparison in small-object tracking research.
📝 Abstract
Fast and tiny object tracking remains a challenge in computer vision. In this paper we first introduce a JSON metadata file associated with four open-source datasets of Fast Moving Object (FMO) image sequences. In addition, we extend the description of the FMO datasets with additional ground-truth information in JSON format (called FMOX), including object size information. Finally, we use our FMOX file to test a recently proposed foundation model for tracking (EfficientTAM), showing that its performance compares well with the pipelines originally tailored for these FMO datasets. Our comparison of these state-of-the-art techniques on FMOX is reported with Trajectory Intersection over Union (TIoU) scores. The code and JSON are shared open source, making FMOX accessible and usable by other machine learning pipelines that process FMO datasets.
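The abstract describes FMOX as per-sequence JSON annotations enriched with object-size information, but does not reproduce the schema. A minimal sketch of what such a record might look like is shown below; every field name here is an illustrative assumption, not the published FMOX specification.

```python
import json

# Hypothetical FMOX-style annotation record. Field names and the
# bounding-box convention are assumptions for illustration only.
record = {
    "dataset": "FMO",              # one of the four unified source datasets (assumed key)
    "sequence": "volleyball_01",   # hypothetical sequence identifier
    "frame": 17,                   # frame index within the sequence
    "bbox": [412, 230, 441, 258],  # [x1, y1, x2, y2] in pixels (assumed convention)
    "object_size_px": 29,          # object-scale attribute added by FMOX (assumed key)
}

# Serialize to JSON, as FMOX annotations are distributed in JSON format.
serialized = json.dumps(record, indent=2)
```

A consumer pipeline would parse such records per frame and feed the boxes to a tracker or an evaluation script.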
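The benchmark is scored with Trajectory Intersection over Union (TIoU). A common way to define such a trajectory-level score is to average frame-wise bounding-box IoU between the predicted and ground-truth trajectories; the sketch below follows that assumption and is not taken from the paper's evaluation code.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def tiou(pred_traj, gt_traj):
    """Trajectory IoU as the mean frame-wise IoU (assumed definition)."""
    return sum(iou(p, g) for p, g in zip(pred_traj, gt_traj)) / len(gt_traj)
```

Under this definition a perfect tracker scores 1.0, and frames with no overlap contribute 0 to the average.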