Taming Cold Starts: Proactive Serverless Scheduling with Model Predictive Control

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address high tail latency caused by serverless cold starts, this paper proposes the first model predictive control (MPC)-based scheduling framework that jointly optimizes container warm-up strategies and request dispatching. The framework integrates time-series invocation forecasting, MPC-driven dynamic decision-making, and a lightweight container warm-up mechanism; it is implemented within Apache OpenWhisk and deployed on Kubernetes. Its key contribution is applying MPC to serverless resource orchestration for the first time, enabling coordinated optimization of warm-up timing and load distribution that simultaneously reduces resource overhead and mitigates tail latency. Experimental evaluation demonstrates that, compared to state-of-the-art approaches, the framework achieves up to an 85% reduction in tail latency and a 34% decrease in resource consumption.
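The summary does not spell out the controller itself. As a rough illustration of the receding-horizon decision an MPC scheduler makes, the sketch below picks how many containers to prewarm now by simulating candidate actions against a forecast; all names, cost weights, and the search strategy are assumptions for illustration, not the paper's algorithm:

```python
# Hypothetical receding-horizon prewarming controller. The cost model
# (keep-alive cost vs. cold-start penalty) and the exhaustive first-step
# search are illustrative assumptions, not the paper's method.

def mpc_prewarm(forecast, warm, horizon=4, max_new=8,
                keep_alive_cost=1.0, cold_start_penalty=10.0,
                per_container_capacity=1):
    """Choose the number of containers to prewarm now by rolling out
    each candidate action over the forecast horizon and returning the
    one with the lowest simulated total cost."""
    def rollout(first_action):
        pool, cost = warm, 0.0
        for t, arrivals in enumerate(forecast[:horizon]):
            # Candidate action applies only at the first step; later
            # steps use a simple fill-to-demand policy.
            add = first_action if t == 0 else max(
                0, arrivals // per_container_capacity - pool)
            pool += add
            capacity = pool * per_container_capacity
            misses = max(0, arrivals - capacity)  # requests hitting cold starts
            cost += keep_alive_cost * pool + cold_start_penalty * misses
            pool += misses  # cold-started containers stay warm afterwards
        return cost

    # Exhaustive search over the (small) first-step action space.
    return min(range(max_new + 1), key=rollout)
```

Only the first-step decision is executed; at the next interval the controller re-solves with an updated forecast, which is the receding-horizon pattern that lets prewarming track bursty arrivals.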

📝 Abstract
Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. Its on-demand nature makes it especially appealing for latency-sensitive and bursty workloads. However, the cold start problem, i.e., the significant delay incurred when provisioning new containers, remains the Achilles' heel of such platforms. This paper presents a predictive serverless scheduling framework based on Model Predictive Control to proactively mitigate cold starts, thereby improving end-to-end response time. By forecasting future invocations, the controller jointly optimizes container prewarming and request dispatching, improving latency while minimizing resource overhead. We implement our approach on Apache OpenWhisk, deployed on a Kubernetes-based testbed. Experimental results using real-world function traces and synthetic workloads demonstrate that our method significantly outperforms state-of-the-art baselines, achieving up to 85% lower tail latency and a 34% reduction in resource usage.
Problem

Research questions and friction points this paper is trying to address.

Mitigate cold starts in serverless computing platforms
Optimize container prewarming and request dispatching proactively
Improve end-to-end response time and reduce resource usage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model Predictive Control for proactive scheduling
Forecasting invocations to optimize container prewarming
Joint optimization of latency and resource usage
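The specific time-series model behind the invocation forecasting is not named in this summary; as a hypothetical stand-in, a simple exponentially weighted moving average could supply the per-interval invocation forecast a proactive controller consumes:

```python
# Illustrative stand-in only: the paper's actual forecasting model is
# not specified in this summary.

def ewma_forecast(history, horizon=4, alpha=0.5):
    """Forecast per-interval invocation counts with an exponentially
    weighted moving average, repeated flat over the horizon."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return [round(level)] * horizon
```

A real deployment would likely use a seasonality-aware model, since function traces often show strong daily and weekly periodicity, but the controller interface (a list of predicted counts per interval) stays the same.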