SEE++: Evolving Snowpark Execution Environment for Modern Workloads

📅 2025-11-16
🤖 AI Summary
Existing Snowpark Execution Environments (SEEs) lack the fine-grained, high-assurance sandbox isolation that modern data engineering and AI/ML workloads require. To address this, the paper describes a secure execution environment built on gVisor. The approach integrates a customized gVisor sandbox into Snowflake’s virtual warehouse nodes, combining lightweight virtualization, fine-grained resource governance, and kernel-level security hardening to run multi-language runtimes, including Python, safely and with high performance. Compared to the previous in-house SEE sandbox, the new architecture is reported to strengthen isolation guarantees while improving runtime performance, scalability, and operational maintainability. Case studies from deployment indicate that the design remains compatible with existing Snowpark APIs and workloads. The result is an extensible execution foundation for next-generation Snowpark applications involving complex, heterogeneous, and security-sensitive data and ML pipelines.

📝 Abstract
Snowpark enables Data Engineering and AI/ML workloads to run directly within Snowflake by deploying a secure sandbox on virtual warehouse nodes. This Snowpark Execution Environment (SEE) allows users to execute arbitrary workloads in Python and other languages in a secure and performant manner. As adoption has grown, the diversity of workloads has introduced increasingly sophisticated needs for sandboxing. To address these evolving requirements, Snowpark transitioned its in-house sandboxing solution to gVisor, augmented with targeted optimizations. This paper describes both the functional and performance objectives that guided the upgrade, outlines the new sandbox architecture, and details the challenges encountered during the journey, along with the solutions developed to resolve them. Finally, we present case studies that highlight new features enabled by the upgraded architecture, demonstrating SEE's extensibility and flexibility in supporting the next generation of Snowpark workloads.
Problem

Research questions and friction points this paper is trying to address.

Evolving sandboxing needs for diverse Snowpark workloads
Transitioning to gVisor with targeted performance optimizations
Enabling extensible architecture for next-generation AI workloads
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transitioned sandboxing solution to gVisor
Augmented gVisor with targeted optimizations
Enhanced security for Python workload execution
Gaurav Jain
Snowflake, Inc
Brandon Baker
Snowflake, Inc
Joe Yin
Snowflake, Inc
Chenwei Xie
Snowflake, Inc
Zihao Ye
NVIDIA, University of Washington
Compilers, Machine Learning Systems
Sidh Kulkarni
Snowflake, Inc
Sara Abdelrahman
Snowflake, Inc
Nova Qi
Snowflake, Inc
Urjeet Shrestha
Snowflake, Inc
Mike Halcrow
Snowflake, Inc
Dave Bailey
Snowflake, Inc
Yuxiong He
Snowflake, Inc