🤖 AI Summary
To address high end-to-end latency, poor deadline adherence, and low throughput in DNN task offloading from resource-constrained mobile edge devices under heavy load, this paper proposes a lightweight real-time scheduling framework. Our method introduces: (1) a low-latency DNN task abstraction model coupled with a dynamic bandwidth estimation algorithm for fine-grained communication cost modeling; and (2) a joint scheduling strategy integrating device resource availability, discretized network state representation, and priority-aware preemption. Evaluated on a Raspberry Pi-based edge cluster, the framework reduces average end-to-end latency by 38.2%, significantly improves deadline satisfaction rate and task throughput, and demonstrates practical efficacy and deployability in a real-world industrial sorting scenario.
📝 Abstract
In this paper, we present a solution for low-latency, deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with a lightweight network state representation that considers device availability, communication on the network link, priority-aware pre-emption, and task deadlines. To reduce latency, the algorithm combines a resource availability representation with network discretisation and a dynamic bandwidth estimation mechanism. We implement the scheduling algorithm in a system composed of four Raspberry Pi 2 Model B mobile edge devices that sample a waste classification conveyor belt at a set frame rate. We evaluate the system against our previous approach, which was shown to outperform work-stealers and a non-pre-emptive scheduling heuristic in the same waste classification scenario. Our findings show that the novel lower-latency abstraction models yield better performance under high-volume workloads, and that dynamic bandwidth estimation assists task placement and ultimately increases task throughput in times of resource scarcity.
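To make the role of dynamic bandwidth estimation in task placement more concrete, here is a minimal sketch. The abstract does not say how bandwidth is estimated or how deadlines are checked, so everything here is an assumption for illustration: the exponentially weighted moving average (EWMA), the smoothing factor `alpha`, and the names `BandwidthEstimator` and `meets_deadline` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BandwidthEstimator:
    """Running link-throughput estimate in bytes per second.

    Assumption: an EWMA stands in for the paper's (unspecified) estimator.
    """
    alpha: float = 0.3                    # weight given to the newest sample
    estimate_bps: Optional[float] = None  # None until the first observation

    def update(self, bytes_sent: int, seconds: float) -> float:
        """Fold one observed transfer into the estimate and return it."""
        if seconds <= 0:
            raise ValueError("transfer duration must be positive")
        sample = bytes_sent / seconds
        if self.estimate_bps is None:
            self.estimate_bps = sample
        else:
            self.estimate_bps = (
                self.alpha * sample + (1 - self.alpha) * self.estimate_bps
            )
        return self.estimate_bps


def meets_deadline(now: float, deadline: float, payload_bytes: int,
                   est_bps: float, compute_s: float) -> bool:
    """Hypothetical feasibility check: would offloading finish in time?

    Predicted completion = now + transfer time + remote compute time.
    """
    transfer_s = payload_bytes / est_bps
    return now + transfer_s + compute_s <= deadline
```

In a scheduler of this kind, each completed transfer would be passed to `update`, and the resulting estimate would be fed to `meets_deadline` before a task is placed on a remote device. A task that fails the check could instead run locally or trigger pre-emption of a lower-priority task.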