🤖 AI Summary
This work addresses the high memory and CPU overhead in serverless computing caused by every function VM loading its own copy of the communication stack, which limits deployment density. The authors propose Nexus, a KVM-based hypervisor that transparently intercepts I/O at the API boundary and offloads it to a host-shared backend. Using zero-copy shared memory, the design decouples computation from I/O while preserving compatibility with the conventional serverless programming model. This separation also enables asynchronous I/O optimizations: overlapping input payload prefetching with VM restoration from a snapshot, and writing output payloads back to storage off the critical path. Experimental results show that, compared to a production baseline, the approach reduces node-level CPU and memory consumption by up to 44% and 31%, respectively, increases deployment density by 37%, and decreases cold- and warm-start latencies by 10% and 39%, respectively.
📝 Abstract
Serverless computing depends on extreme multi-tenancy to remain economically viable, which drives providers to use virtual machines (VMs) that ensure strong isolation and seamless ecosystem compatibility with the FaaS programming model. However, current architectures tightly couple application processing logic with I/O processing, forcing every VM to duplicate a heavy communication fabric (cloud SDK, RPC, and TCP/IP). Our analysis reveals that this duplication consumes over 25% of a function's memory footprint and may double the CPU cycles consumed in VMs compared to bare-metal execution. While prior systems attempt to solve this using WebAssembly or library OSes, they sacrifice ecosystem compatibility, forcing developers to migrate code and dependencies to new languages.
We introduce Nexus, a serverless-native KVM-based hypervisor that transparently decouples compute from I/O. Nexus shifts the execution model by intercepting the communication fabric at the API boundary and offloading it to an always-on, host-shared backend via zero-copy shared memory. This removes the heavyweight communication fabric from the guest VM while preserving the conventional serverless programming model. By structurally separating these domains, Nexus unlocks asynchronous I/O optimizations: overlapping input payload prefetching with VM restoration from a snapshot, and writing output payloads back to storage off the critical path. Compared to the production baseline, Nexus reduces overall node-level CPU and memory consumption by up to 44% and 31%, respectively, thus increasing deployment density by 37%. Nexus also reduces warm- and cold-start latency by 39% and 10%, respectively, bringing the response time within 20% of that of a WASM-based, ecosystem-incompatible hypervisor.
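The two asynchronous I/O optimizations named above can be illustrated with a minimal sketch. This is not Nexus code: every function name below is hypothetical, and the `asyncio.sleep` calls are stand-ins for snapshot restoration, storage reads, and storage writes. It only shows the scheduling idea, assuming restore and prefetch are independent and the output write can complete after the response is returned.

```python
import asyncio
import time

# All names and delays below are hypothetical stand-ins, not Nexus APIs.

async def restore_vm_from_snapshot() -> str:
    await asyncio.sleep(0.2)  # simulated snapshot restoration time
    return "vm-ready"

async def prefetch_input() -> bytes:
    await asyncio.sleep(0.2)  # simulated storage read by the host backend
    return b"input-payload"

async def write_output(payload: bytes, store: dict) -> None:
    await asyncio.sleep(0.05)  # simulated storage write
    store["result"] = payload

async def invoke(store: dict):
    # Optimization 1: run VM restoration and input prefetching concurrently.
    _vm, payload = await asyncio.gather(
        restore_vm_from_snapshot(), prefetch_input()
    )
    result = payload.upper()  # stand-in for the function's compute
    # Optimization 2: start the output write-back without waiting for it,
    # so it stays off the response's critical path.
    writeback = asyncio.create_task(write_output(result, store))
    return result, writeback

async def main():
    store = {}
    start = time.perf_counter()
    result, writeback = await invoke(store)
    elapsed = time.perf_counter() - start  # latency seen by the caller
    await writeback  # the backend finishes the write afterwards
    return result, elapsed, store

if __name__ == "__main__":
    result, elapsed, store = asyncio.run(main())
    print(result, f"{elapsed:.2f}s", store)
```

In this toy setup the caller-visible latency is roughly the longer of the two overlapped steps rather than their sum, and the write-back delay does not contribute to it.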