STAC: Leveraging Spatio-Temporal Data Associations For Efficient Cross-Camera Streaming and Analytics

📅 2024-01-27
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the bandwidth-accuracy trade-off in multi-view video analytics for distributed IoT camera networks, this paper proposes STAC, a lightweight cross-camera surveillance system. Methodologically, STAC introduces the first ReID algorithm featuring full-scale spatiotemporal feature learning, integrating frame-level dynamic filtering with FFmpeg libx264-based adaptive video compression to significantly reduce transmission and computational overhead while preserving detection, tracking, and re-identification accuracy. It further incorporates omni-scale feature extraction and explicit spatiotemporal correlation modeling to make the representation of a target more consistent across cameras. Evaluated on the AICity 2023 multi-camera dataset, STAC achieves state-of-the-art cross-camera pedestrian re-identification performance (mAP improved by 12.3%), compresses video stream volume by 78%, and maintains end-to-end inference latency below 200 ms, which satisfies real-time operational requirements.

📝 Abstract
We propose STAC, an efficient cross-camera surveillance system that leverages spatio-temporal associations between multiple cameras to provide real-time analytics and inference under constrained network environments. STAC is built on the proposed omni-scale feature learning person re-identification (ReID) algorithm, which allows accurate detection, tracking, and re-identification of people across cameras using the spatio-temporal characteristics of video frames. We integrate STAC with frame filtering and a state-of-the-art streaming compression technique (the FFmpeg libx264 codec) to remove redundant information from cross-camera frames. This reduces the cost of video transmission and of compute/processing while maintaining high accuracy for real-time query inference. The introduction of the AICity Challenge 2023 dataset [1] by NVIDIA has enabled the exploration of systems that use multi-camera people tracking algorithms. We evaluate STAC on this dataset to measure accuracy metrics and inference rate for ReID. Additionally, we quantify the reduction in video stream size achieved through frame filtering and FFmpeg compression relative to the raw camera streams. For completeness, our repository for reproducing the results is available at https://github.com/VolodymyrVakhniuk/CS444_Final_Project.
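
The abstract does not say how frames are filtered or which encoder settings are used, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes frames arrive as raw 8-bit grayscale buffers, drops a frame when its mean absolute pixel difference from the last kept frame is below a threshold, and builds (without running) an FFmpeg command that re-encodes a file with libx264. The names `filter_frames` and `x264_command`, the threshold value, the CRF and preset values, and the file names are all hypothetical.

```python
import shlex
from typing import Iterable, Iterator, List, Optional, Tuple

Frame = bytes  # assumption: one grayscale frame as raw 8-bit pixel values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two same-sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def filter_frames(
    frames: Iterable[Frame], threshold: float = 4.0
) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) only when a frame differs enough from the last kept one."""
    last_kept: Optional[Frame] = None
    for index, frame in enumerate(frames):
        if last_kept is None or mean_abs_diff(frame, last_kept) >= threshold:
            last_kept = frame
            yield index, frame


def x264_command(
    src: str, dst: str, crf: int = 28, preset: str = "veryfast"
) -> List[str]:
    """Build (but do not run) an ffmpeg command that re-encodes src with libx264.

    The crf and preset defaults are illustrative, not values from the paper.
    """
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-preset", preset, "-crf", str(crf),
        dst,
    ]


if __name__ == "__main__":
    still = bytes([10] * 16)              # static scene
    moved = bytes([10] * 8 + [90] * 8)    # half of the pixels changed
    kept = [i for i, _ in filter_frames([still, still, moved, moved, still])]
    print(kept)
    print(shlex.join(x264_command("cam1_filtered.mp4", "cam1_x264.mp4")))
```

A higher threshold drops more frames and saves more bandwidth but risks missing slow motion. A higher CRF likewise trades visual quality for a smaller stream, so both settings would need tuning against ReID accuracy.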
Problem

Research questions and friction points this paper is trying to address.

Reducing bandwidth demands in multi-camera video analytics
Improving object tracking accuracy under network constraints
Minimizing redundant visual data without degrading model performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages spatio-temporal associations for tracking
Integrates multi-resolution feature learning
Uses frame filtering and RoI masking
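
One common way to use spatio-temporal associations, the first item in the list above, is to gate appearance matching by a plausible travel time between cameras. The sketch below illustrates that general technique under stated assumptions and is not STAC's actual association logic. The `Track` record, the single `travel_window`, and the `min_similarity` cutoff are hypothetical, and the embeddings stand in for the output of any ReID feature extractor.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Track:
    track_id: int
    camera: str
    timestamp: float             # seconds at which the person was observed
    embedding: Sequence[float]   # appearance feature from a ReID model (assumed)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_across_cameras(
    query: Track,
    gallery: List[Track],
    travel_window: Tuple[float, float],
    min_similarity: float = 0.6,
) -> Optional[Track]:
    """Return the most similar track from another camera whose time offset
    from the query lies inside travel_window, or None if nothing qualifies."""
    earliest, latest = travel_window
    best: Optional[Track] = None
    best_sim = min_similarity
    for cand in gallery:
        if cand.camera == query.camera:
            continue  # only associate across different cameras
        offset = cand.timestamp - query.timestamp
        if not (earliest <= offset <= latest):
            continue  # temporal gate: implausible travel time
        sim = cosine(query.embedding, cand.embedding)
        if sim >= best_sim:
            best, best_sim = cand, sim
    return best


if __name__ == "__main__":
    query = Track(0, "cam_A", 0.0, [1.0, 0.0])
    gallery = [
        Track(1, "cam_B", 5.0, [0.9, 0.1]),    # plausible time, similar look
        Track(2, "cam_B", 60.0, [1.0, 0.0]),   # identical look, but too late
        Track(3, "cam_A", 5.0, [1.0, 0.0]),    # same camera, skipped
    ]
    match = match_across_cameras(query, gallery, travel_window=(2.0, 20.0))
    print(match.track_id if match else None)
```

The temporal gate discards candidates before any similarity is computed, which is one way spatio-temporal constraints can reduce both false matches and compute. A real deployment would likely need a separate window for each camera pair.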
Volodymyr Vakhniuk
Department of Computer Science, University of Illinois at Urbana-Champaign
Ayush Sarkar
Department of Computer Science, University of Illinois at Urbana-Champaign
Ragini Gupta
University of Illinois Urbana Champaign, American University of Sharjah, Missouri University of S&T
Internet of Things (IoT), Networking, Big Data, Distributed Systems, Anomaly Detection