🤖 AI Summary
To address the high cold-start overhead in serverless computing, and the prohibitive cost and unreliability of existing prewarming strategies under bursty workloads, this paper proposes the Universal Workers paradigm. The method exploits the long-tail skew of real-world FaaS workloads, in which most requests invoke a small subset of functions, to construct locality-aware groups, and introduces a three-tier caching architecture—spanning function handlers, package installs, and module imports—to enable on-demand reuse of function instances. Unlike conventional full prewarming, Universal Workers avoids unnecessary warm-up by combining lightweight runtime reuse with locality-aware scheduling, mitigating cold starts at their source. Evaluation results show an 87% reduction in P99 cold-start latency, along with improved resource utilization and throughput under bursty loads and more stable execution in multi-function, high-concurrency scenarios.
📝 Abstract
Serverless computing enables developers to deploy code without managing infrastructure, but suffers from cold-start overhead when initializing new function instances. Existing solutions such as "keep-alive" or "pre-warming" are costly and unreliable under bursty workloads. We propose universal workers: computational units capable of executing any function with minimal initialization overhead. Based on an analysis of production workload traces, our key insight is that request distributions in Function-as-a-Service (FaaS) platforms are highly skewed, with most requests invoking a small subset of functions. We exploit this observation to approximate universal workers through locality groups and three-tier caching (handler, install, import). With this work, we aim to enable more efficient and scalable FaaS platforms capable of handling diverse workloads with minimal initialization overhead.
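To make the three-tier caching idea concrete, here is a minimal sketch of how a worker might classify an incoming invocation by which cache tier it hits. The tier names (handler, install, import) follow the abstract; the `Fn` descriptor, class names, and fallback policy are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fn:
    """Hypothetical function descriptor: id plus its dependency sets."""
    id: str
    packages: frozenset  # packages that must be installed
    modules: frozenset   # modules that must be imported

class Worker:
    def __init__(self):
        self.handler_cache = {}    # function id -> initialized handler
        self.install_cache = set() # packages already installed
        self.import_cache = set()  # modules already imported

    def start_kind(self, fn):
        """Classify the start: which cache tier serves this invocation."""
        if fn.id in self.handler_cache:
            return "handler-hit"   # fully warm: run immediately
        if all(p in self.install_cache for p in fn.packages):
            if all(m in self.import_cache for m in fn.modules):
                return "import-hit"   # only re-bind the handler
            return "install-hit"      # imports needed, installs cached
        return "cold"                 # full initialization required

    def run(self, fn):
        kind = self.start_kind(fn)
        # Simulate the initialization work each deeper tier avoids.
        if kind == "cold":
            self.install_cache |= fn.packages
        if kind in ("cold", "install-hit"):
            self.import_cache |= fn.modules
        if kind != "handler-hit":
            self.handler_cache[fn.id] = object()  # stand-in for a handler
        return kind
```

Under the skewed-workload observation, most invocations in a locality group would land on the cheaper tiers: a second call to the same function is a handler hit, and a different function sharing the group's packages and modules needs only handler initialization.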