🤖 AI Summary
This work addresses the trade-off between cold-start latency and the carbon emissions of idle keep-alive instances in serverless computing. The trade-off is further complicated by time-varying grid carbon intensity and fluctuating workloads, which render traditional static keep-alive strategies inefficient. To tackle this challenge, the paper presents the first unified framework that integrates carbon awareness and delay sensitivity into dynamic keep-alive decisions, formulating the problem as a sequential decision-making process. The authors propose an adaptive scheduling approach based on deep reinforcement learning that jointly optimizes cold-start probability, latency cost, and carbon emissions in real time. Experiments on real-world traces from Huawei Cloud show that, compared to static strategies, the proposed method reduces cold starts by 51.69% and idle keep-alive carbon emissions by 77.08%, significantly outperforming existing heuristic and single-objective approaches in balancing latency and carbon efficiency while approaching Oracle-level performance.
📝 Abstract
Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances, while minimizing carbon emissions favors reclaiming idle resources. This balance is further complicated by time-varying grid carbon intensity and fluctuating workload patterns, under which static keep-alive policies are inefficient. We present LACE-RL, a latency-aware and carbon-efficient management framework that formulates serverless pod retention as a sequential decision problem. LACE-RL uses deep reinforcement learning to dynamically tune keep-alive durations, jointly modeling cold-start probability, function-specific latency costs, and real-time carbon intensity. Using the Huawei Public Cloud Trace, we show that LACE-RL reduces cold starts by 51.69% and idle keep-alive carbon emissions by 77.08% compared to Huawei's static policy, while achieving better latency-carbon trade-offs than state-of-the-art heuristic and single-objective baselines, approaching Oracle performance.
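The abstract describes an RL objective that jointly accounts for cold-start probability, function-specific latency costs, and real-time carbon intensity. The paper's actual formulation is not given here, but a minimal sketch conveys the shape of such a per-interval reward. All names, weights, the idle power draw, and the 60-second decision interval below are illustrative assumptions, not details from the paper:

```python
from dataclasses import dataclass

@dataclass
class KeepAliveState:
    # Illustrative state features (hypothetical, not the paper's state space)
    idle_time_s: float        # seconds since the pod last served a request
    arrival_rate_hz: float    # recent request arrival-rate estimate
    carbon_intensity: float   # current grid carbon intensity (gCO2/kWh)
    latency_weight: float     # function-specific cold-start latency cost

def reward(state: KeepAliveState, keep_alive: bool, request_arrived: bool,
           idle_power_w: float = 5.0, alpha: float = 1.0,
           beta: float = 1.0) -> float:
    """Toy per-interval reward: penalize cold starts (scaled by the
    function's latency sensitivity) and idle keep-alive emissions
    (scaled by live grid carbon intensity)."""
    r = 0.0
    if request_arrived and not keep_alive:
        # Pod was reclaimed, so an arriving request pays a cold-start penalty
        r -= alpha * state.latency_weight
    if keep_alive:
        # Idle energy over one assumed 60 s decision interval, in kWh,
        # converted to grams of CO2 via the current grid intensity
        energy_kwh = idle_power_w * 60 / 3.6e6
        r -= beta * state.carbon_intensity * energy_kwh
    return r
```

The agent would then pick keep-alive durations to maximize this reward over time, trading a certain small carbon cost now against a probabilistic cold-start penalty later.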