🤖 AI Summary
To address degraded multi-object detection and tracking performance in 360° omnidirectional cycling videos—caused by severe distortion, densely packed small objects, and discontinuous object boundaries—this paper proposes an end-to-end solution. First, a quad-view perspective projection strategy is designed to enhance robustness in detecting small objects. Second, DeepSORT is improved with boundary-continuity modeling and class-aware association to mitigate trajectory fragmentation and misassociation. Third, an overtaking behaviour recognition application is built upon the omnidirectional tracking trajectories. Evaluated on a newly constructed panoramic cycling dataset, the method improves the detection AP of YOLO v5m6 and Faster RCNN-FPN at every input resolution, raises MOTA and IDF1 by 7.6% and 9.7% respectively, and attains an F-score of 0.88 for overtaking detection. The source code is open-sourced to ensure reproducibility.
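The boundary-continuity idea in the second step can be illustrated with a small sketch: on an equirectangular panorama, a box leaving the right edge re-enters on the left, so association overlap must be computed modulo the image width. The function below is an illustrative one-dimensional version of such wrap-around IoU matching (the names and interval convention are assumptions for this sketch, not the paper's implementation):

```python
def wrap_iou_1d(a, b, width):
    """IoU of two horizontal intervals on a cyclic image of given width.

    Intervals are (left, right) pixel coordinates; `right` may exceed
    `width` for a box that crosses the panorama seam. This is a sketch
    of boundary-continuity matching, not the paper's actual code.
    """
    def overlap(p, q):
        # Compare p against q and q shifted by one full image width
        # in either direction, so seam-crossing boxes still overlap.
        best = 0.0
        for shift in (-width, 0, width):
            lo = max(p[0], q[0] + shift)
            hi = min(p[1], q[1] + shift)
            best = max(best, hi - lo)
        return best

    inter = overlap(a, b)
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0
```

With a 100-pixel-wide panorama, a box spanning (90, 110) (wrapping past the seam) and a box spanning (0, 10) overlap by 10 pixels, giving an IoU of 0.5, whereas plain IoU would report 0 and fragment the trajectory. Class-aware association would additionally forbid matches between detections of different categories.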
📝 Abstract
Panoramic cycling videos record 360° views around the cyclist, making them a valuable source of data for cycling-safety studies, provided the surrounding road users can be analysed automatically with computer vision models. However, characteristics of panoramic data such as severe distortion, large numbers of small objects, and boundary continuity pose great challenges to existing CV models: performance degrades, and standard evaluation methods no longer apply. In addition, the lack of annotated data makes retraining the models difficult. In response to these problems, the project proposed and implemented a three-step methodology: (1) improve the prediction performance of pre-trained object detection models on panoramic data by projecting each original image into 4 perspective sub-images; (2) introduce support for boundary continuity and category information into DeepSORT, a commonly used multiple object tracking model, and set an improved detection model as its detector; (3) using the tracking results, develop an application for detecting the overtaking behaviour of surrounding vehicles. Evaluated on the panoramic cycling dataset built by the project, the proposed methodology improves the average precision of YOLO v5m6 and Faster RCNN-FPN at every input resolution setting. In addition, it raises the MOTA and IDF1 of DeepSORT by 7.6% and 9.7% respectively. When detecting overtakes in the test videos, it achieves an F-score of 0.88. The code is available on GitHub at github.com/cuppp1998/360_object_tracking to ensure reproducibility and support further improvements.
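Step (1) of the methodology can be sketched as follows: each equirectangular frame is resampled into four pinhole-camera sub-images at 90° yaw intervals, which removes most of the distortion that hurts pre-trained detectors. The code below is a minimal nearest-neighbour NumPy sketch of this projection; the function names, the 90° field of view, and the output size are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def equirect_to_perspective(equi, yaw_deg, fov_deg=90.0, out_size=256):
    """Sample one perspective sub-view from an equirectangular frame.

    equi: H x W x C equirectangular image (360 deg horizontal span).
    yaw_deg: horizontal viewing direction in degrees (0 = frame centre).
    Illustrative nearest-neighbour resampling, not the paper's code.
    """
    H, W = equi.shape[:2]
    # Pinhole focal length for the requested field of view
    f = (out_size / 2) / np.tan(np.radians(fov_deg) / 2)
    # Pixel grid centred on the principal point
    xs = np.arange(out_size) - out_size / 2 + 0.5
    x, y = np.meshgrid(xs, xs)
    z = np.full_like(x, f)
    # Rotate the viewing rays by the yaw angle about the vertical axis
    yaw = np.radians(yaw_deg)
    xr = x * np.cos(yaw) + z * np.sin(yaw)
    zr = -x * np.sin(yaw) + z * np.cos(yaw)
    # Ray direction -> longitude/latitude on the viewing sphere
    lon = np.arctan2(xr, zr)                # [-pi, pi]
    lat = np.arctan2(y, np.hypot(xr, zr))   # (-pi/2, pi/2)
    # Longitude/latitude -> equirectangular pixel coordinates
    u = ((lon / np.pi + 1) / 2 * W).astype(int) % W
    v = np.clip(((lat / (np.pi / 2) + 1) / 2 * H).astype(int), 0, H - 1)
    return equi[v, u]

def quad_views(equi, out_size=256):
    """Four 90-degree sub-views together covering the full panorama."""
    return [equirect_to_perspective(equi, yaw, 90.0, out_size)
            for yaw in (0, 90, 180, 270)]
```

A detector is then run on each of the four sub-views, and the resulting boxes are mapped back to panorama coordinates before tracking; a production version would use bilinear interpolation (e.g. `cv2.remap`) rather than nearest-neighbour sampling.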