TrEnv: Transparently Share Serverless Execution Environments Across Different Functions and Nodes

📅 2025-09-11
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Serverless platforms face high infrastructure overhead when executing emerging workloads such as LLM agents: frequent cold starts and volatile resource demands can drive platform costs up to 70% of the corresponding LLM API call expenses. To address this, we propose a high-density serverless execution framework that shares execution environments across functions and nodes. Our approach introduces (1) repurposable sandboxes and memory templates for rapid environment restoration and sharing, and (2) for VM-based deployments, browser sharing combined with page-cache bypassing to reduce system-level overhead. The framework unifies container and VM abstractions. Evaluation shows that, in containerized environments, it reduces P99 latency by up to 7× and cuts memory usage by 48%; in VM environments, it achieves up to 58% lower P99 latency and 61% memory savings, significantly outperforming state-of-the-art systems such as E2B.
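The summary names repurposable sandboxes and memory templates but not their implementation. As a rough intuition for how a memory template can make restoration both fast and memory-cheap, the sketch below maps a pre-built template file copy-on-write, so every sandbox on a node shares the template's physical pages until it writes to them. This is a minimal Python illustration of the general technique, not TrEnv's actual format or API; `func.template` and `restore_from_template()` are hypothetical.

```python
"""Minimal sketch of restoring sandbox state from a shared memory template.

Hypothetical illustration: TrEnv's template format and API are not
described in this listing. "func.template" and restore_from_template()
are placeholders. Linux-only.
"""
import mmap
import os


def restore_from_template(path: str) -> mmap.mmap:
    """Map a template copy-on-write; pages stay shared until first write."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # MAP_PRIVATE gives copy-on-write semantics: reads share the
        # template's physical pages with every other sandbox on the node;
        # a write faults in a private copy for this sandbox only.
        return mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)  # the mapping remains valid after the fd is closed


mem = restore_from_template("func.template")
print(f"restored {len(mem)} bytes of pre-initialized state")
mem.close()
```

On a copy-on-write mapping, a warm start only touches the pages a function actually dirties, which is the intuition behind both the latency and the memory numbers above.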

📝 Abstract
Serverless computing provides dynamic scalability, but its infrastructure overhead becomes a bottleneck for emerging workloads such as LLM agents, which exhibit unpredictable invocation patterns and variable resource demands. Our analysis shows that for these agents, the cost of running on serverless platforms can reach up to 70% of the cost of LLM API calls. This finding motivates the need for a more efficient, high-density serverless platform. We present TrEnv, a co-designed serverless platform that supports both container- and VM-based environments, optimized for the unique demands of LLM agents. TrEnv reduces startup latency and memory usage through repurposable sandboxes and memory templates, which enable fast reuse and restoration of execution environments. To further reduce overhead in VM-based agent workloads, TrEnv leverages browser sharing and a page cache bypassing mechanism. Evaluations show that TrEnv reduces P99 latency by up to 7× and memory usage by 48% in container-based settings, and achieves up to 58% lower P99 latency and 61% memory savings for VM-based agents compared to state-of-the-art systems like E2B.
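The abstract also credits browser sharing for VM-based agent workloads. TrEnv's VM-level sharing mechanism is not described in this listing, but the general idea, one browser process serving many isolated agent sessions instead of one browser per sandbox, can be illustrated with Playwright's browser contexts (a real API, used here purely as an analogy; requires the `playwright` package and `playwright install chromium`):

```python
"""Sketch of browser sharing across agent sessions.

Analogy only: TrEnv's VM-level sharing mechanism is not described in
this listing. Playwright's browser contexts show the general idea of
one browser process serving many isolated sessions.
"""
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # One shared browser process for the whole node...
    browser = p.chromium.launch(headless=True)

    # ...with an isolated context (cookies, cache, local storage) per
    # agent, instead of paying for a full browser instance per sandbox.
    for agent_id in range(3):
        ctx = browser.new_context()
        page = ctx.new_page()
        page.goto("https://example.com")
        print(f"agent {agent_id} sees: {page.title()}")
        ctx.close()

    browser.close()
```

Each context carries its own cookies, cache, and storage, so agents remain isolated while amortizing a single browser process's footprint across all of them.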
Problem

Research questions and friction points this paper is trying to address.

Reducing infrastructure overhead in serverless computing for LLM agents
Optimizing execution environments for unpredictable invocation patterns
Minimizing startup latency and memory usage in serverless platforms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Repurposable sandboxes for fast environment reuse
Memory templates reducing startup latency and memory usage
Browser sharing and page cache bypassing for VM-based agents (sketched below)
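The listing does not detail TrEnv's page-cache bypassing mechanism either. Assuming the standard Linux technique, direct I/O is how reads are kept from populating the kernel page cache; the sketch below shows that general mechanism, with `rootfs.img` as a placeholder file (Linux-only, Python 3.7+ for `os.preadv`).

```python
"""Sketch of page-cache bypassing with Linux direct I/O (O_DIRECT).

Assumption: this shows the standard Linux technique, not TrEnv's exact
mechanism, which the listing does not describe. "rootfs.img" is a
placeholder. Linux-only; needs Python 3.7+.
"""
import mmap
import os

BLOCK = 4096  # O_DIRECT requires block-aligned buffers, offsets, and sizes

fd = os.open("rootfs.img", os.O_RDONLY | os.O_DIRECT)
try:
    # Anonymous mmap buffers are page-aligned, satisfying O_DIRECT.
    buf = mmap.mmap(-1, BLOCK)
    # Read one block straight from storage; no duplicate copy is left
    # in the page cache, so co-located sandboxes don't inflate memory.
    n = os.preadv(fd, [buf], 0)
    print(f"read {n} bytes without populating the page cache")
finally:
    os.close(fd)
```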
👥 Authors

Jialiang Huang (Tsinghua University and Alibaba Group, China)
Teng Ma (Alibaba Group, China)
Zheng Liu (Alibaba Group and Zhejiang University, China)
Sixing Lin (Tsinghua University, China)
Kang Chen (Tsinghua University, China)
Jinlei Jiang (Department of Computer Science and Technology, Tsinghua University)
Xia Liao (Tsinghua University, China)
Yingdi Shan (Tsinghua University, China)
Yongwei Wu (Tsinghua University, China)
Ning Zhang (Alibaba Group, China)
Mengting Lu (Alibaba Group, China)
Tao Ma (Alibaba Group, China)
Haifeng Gong (Intel, China)
Mingxing Zhang (Tsinghua University, China)