🤖 AI Summary
Multi-tenant DNN inference at the edge must satisfy three competing goals at once: strict latency requirements, limited power consumption, and low carbon emissions. This paper proposes Ecomap, a carbon-intensity-aware scheduling framework with three components. First, it dynamically adjusts the maximum power threshold of edge devices based on real-time carbon intensity. Second, it uses mixed-quality models: when latency constraints are violated, it replaces computationally heavy DNNs with lighter alternatives at run time, accepting a small accuracy loss to keep the service responsive. Third, it uses a transformer-based estimator to guide workload mapping. Evaluated on an NVIDIA Jetson AGX Xavier, Ecomap reduces carbon emissions by 30% and carbon delay product (CDP) by 25% on average compared with state-of-the-art methods, while maintaining comparable or better latency and power efficiency. The result is a system-level approach to carbon-aware edge inference.
📝 Abstract
Edge computing systems struggle to efficiently manage multiple concurrent deep neural network (DNN) workloads while meeting strict latency requirements, minimizing power consumption, and maintaining environmental sustainability. This paper introduces Ecomap, a sustainability-driven framework that dynamically adjusts the maximum power threshold of edge devices based on real-time carbon intensity. Ecomap uses mixed-quality models: when latency constraints are violated, it dynamically replaces computationally heavy DNNs with lighter alternatives, preserving service responsiveness with minimal accuracy loss. It also employs a transformer-based estimator to guide efficient workload mappings. Experiments on an NVIDIA Jetson AGX Xavier show that Ecomap reduces carbon emissions by an average of 30% and achieves a 25% lower carbon delay product (CDP) than state-of-the-art methods, while maintaining comparable or better latency and power efficiency.
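The abstract does not specify how carbon intensity maps to a power threshold, how a lighter model is chosen, or how CDP is computed. The following is a minimal Python sketch of those three ideas under stated assumptions, not Ecomap's actual implementation. The linear ramp in `power_cap_watts`, the thresholds and wattages, the `ModelVariant` type, the selection rule in `pick_variant`, and the CDP definition (emitted carbon times delay, by analogy with energy-delay product) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


def power_cap_watts(carbon_intensity, ci_low=100.0, ci_high=500.0,
                    cap_min=10.0, cap_max=30.0):
    """Map grid carbon intensity (gCO2/kWh) to a device power cap (W).

    Assumption: a linear ramp from cap_max at ci_low down to cap_min at
    ci_high, clamped outside that range. All default values are illustrative.
    """
    if ci_high <= ci_low:
        raise ValueError("ci_high must exceed ci_low")
    frac = (carbon_intensity - ci_low) / (ci_high - ci_low)
    frac = min(1.0, max(0.0, frac))  # clamp to [0, 1]
    return cap_max - frac * (cap_max - cap_min)


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical description of one quality level of a DNN."""
    name: str
    latency_ms: float  # estimated latency under the current mapping
    accuracy: float    # e.g. top-1 accuracy in [0, 1]


def pick_variant(variants, latency_budget_ms):
    """Pick the most accurate variant that meets the latency budget.

    Assumption: if no variant meets the budget, fall back to the fastest one.
    """
    feasible = [v for v in variants if v.latency_ms <= latency_budget_ms]
    if feasible:
        return max(feasible, key=lambda v: v.accuracy)
    return min(variants, key=lambda v: v.latency_ms)


def carbon_delay_product(energy_joules, carbon_intensity, delay_s):
    """Assumed CDP: emitted carbon (gCO2) multiplied by delay (s)."""
    energy_kwh = energy_joules / 3.6e6       # 1 kWh = 3.6e6 J
    carbon_g = energy_kwh * carbon_intensity  # gCO2
    return carbon_g * delay_s


if __name__ == "__main__":
    variants = [
        ModelVariant("heavy", latency_ms=40.0, accuracy=0.76),
        ModelVariant("light", latency_ms=12.0, accuracy=0.71),
    ]
    print(power_cap_watts(300.0))
    print(pick_variant(variants, latency_budget_ms=20.0).name)
    print(carbon_delay_product(3.6e6, 400.0, 0.5))
```

In this sketch a higher carbon intensity lowers the power cap, which would typically raise latency; the variant selection step is where a lighter model would then be swapped in to stay within the latency budget. A real system would replace the fixed `latency_ms` values with predictions from an estimator such as the transformer-based one the abstract mentions.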