🤖 AI Summary
To address latency and computational constraints in high-resolution real-time object detection on edge computing platforms, this paper proposes an enhanced Canvas Attention Scheduling (CAS) mechanism. The method introduces variable-size canvas frames and an adaptive frame-rate scheduling strategy to overcome the limitations of conventional fixed-configuration designs. Integrated with the YOLOv11 detection model and a dynamic region aggregation approach, the system efficiently processes Waymo Open Dataset video streams on the NVIDIA Jetson Orin Nano. Experimental results demonstrate significant improvements in mean average precision (mAP) and recall under identical hardware conditions. The proposed approach improves the attainable quality–resource trade-offs, maintaining low end-to-end latency while enhancing the robustness of mission-critical perception, particularly under dynamic scene complexity and resource variability.
📝 Abstract
Real-time perception on edge platforms faces a core challenge: executing high-resolution object detection under stringent latency constraints on limited computing resources. Canvas-based attention scheduling was proposed in earlier work as a mechanism to reduce the resource demands of perception subsystems. It consolidates areas of interest in an input data frame onto a smaller area, called a canvas frame, that can be processed at the requisite frame rate. This paper extends prior canvas-based attention scheduling literature by (i) allowing for variable-size canvas frames and (ii) employing selectable canvas frame rates that may depart from the original data frame rate. We evaluate our solution by running YOLOv11, as the perception module, on an NVIDIA Jetson Orin Nano to inspect video frames from the Waymo Open Dataset. Our results show that the additional degrees of freedom improve the attainable quality/cost trade-offs, thereby allowing for a consistently higher mean average precision (mAP) and recall with respect to the state of the art.
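The abstract's core idea, consolidating areas of interest from a large input frame onto a smaller canvas frame, can be illustrated with a minimal sketch. The function below (a hypothetical helper, not the authors' implementation) copies rectangular regions of interest into a smaller canvas using a simple left-to-right shelf-packing heuristic; the paper's actual dynamic region aggregation and scheduling logic are not reproduced here.

```python
import numpy as np

def pack_rois_into_canvas(frame, rois, canvas_size):
    """Copy regions of interest (x, y, w, h) from `frame` into a smaller
    canvas frame of shape `canvas_size` = (height, width), using a simple
    left-to-right shelf-packing heuristic. Returns the canvas and the
    placement (x, y, w, h) of each ROI within it, or None if it did not fit.
    Illustrative only: the paper's aggregation/scheduling is more involved."""
    ch, cw = canvas_size
    canvas = np.zeros((ch, cw) + frame.shape[2:], dtype=frame.dtype)
    placements = []
    cur_x, cur_y, shelf_h = 0, 0, 0
    for (x, y, w, h) in rois:
        if cur_x + w > cw:                 # row full: start a new shelf below
            cur_x, cur_y = 0, cur_y + shelf_h
            shelf_h = 0
        if cur_y + h > ch or cur_x + w > cw:   # canvas full: ROI dropped
            placements.append(None)
            continue
        canvas[cur_y:cur_y + h, cur_x:cur_x + w] = frame[y:y + h, x:x + w]
        placements.append((cur_x, cur_y, w, h))
        cur_x += w
        shelf_h = max(shelf_h, h)
    return canvas, placements
```

Running the detector on the (smaller) canvas rather than the full frame is what lets the perception module keep up with the requisite frame rate; detections on the canvas are then mapped back to original-frame coordinates using the recorded placements.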