🤖 AI Summary
To address the growing challenge of balancing efficiency and accuracy in DNN-based video analytics amid exponential video data growth, this paper pioneers an “efficiency-first” paradigm. We systematically survey and construct a cross-layer collaborative optimization framework encompassing hardware acceleration, inter-frame optimization, dynamic inference, model compression, and adaptive sampling. A unified taxonomy for efficiency optimization is proposed to characterize performance bottlenecks and key challenges across the algorithmic, system, and hardware layers. We empirically evaluate state-of-the-art techniques along three dimensions: computational efficiency, resource utilization, and accuracy retention. Our principal contributions are: (1) establishing an efficiency-driven research paradigm; (2) enabling vertical co-design across algorithms, systems, and hardware; and (3) identifying promising future directions for scalable, real-time video analytics, including lightweight architectures, semantics-aware sampling, and heterogeneous compilation optimization.
📝 Abstract
The explosive growth of video data in recent years has raised the demands placed on video analytics, where accuracy and efficiency remain the two primary concerns. Deep neural networks (DNNs) have been widely adopted to ensure accuracy; however, improving their efficiency in video analytics remains an open challenge. Unlike existing surveys, which summarize DNN-based video analytics mainly from the perspective of accuracy optimization, this survey provides a thorough review of optimization techniques that focus on improving the efficiency of DNNs in video analytics. We organize existing methods in a bottom-up manner, covering multiple perspectives such as hardware support, data processing, and operational deployment. Finally, based on this optimization framework and existing work, we analyze and discuss the problems and challenges in the performance optimization of DNN-based video analytics.