Odyssey: An End-to-End System for Pareto-Optimal Serverless Query Processing

📅 2025-10-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing serverless data analytics systems rely on manual tuning or a pre-supplied execution plan, so they cannot automatically identify Pareto-optimal query plans that trade off cost against latency within a very large space of feasible plans. This paper proposes Odyssey, an end-to-end, serverless-native query processing pipeline. It combines state-space pruning heuristics with a novel search algorithm to discover Pareto-optimal plans without human intervention. The pipeline integrates a query planner, a cost model, and a FaaS-native execution engine, so that plans can be generated, evaluated, and executed quickly even for complex queries. Experiments across diverse real-world workloads reportedly show that Odyssey reduces total cost by 37% on average and improves query latency by 2.1× over AWS Athena. The work is presented as the first approach to automate Pareto-optimal query optimization in serverless environments.

📝 Abstract

Running data analytics queries on serverless (FaaS) workers has been shown to be cost- and performance-efficient for a variety of real-world scenarios, including intermittent query arrival patterns, sudden load spikes and management challenges that afflict managed VM clusters. Alas, existing serverless data analytics works focus primarily on the serverless execution engine and assume the existence of a "good" query execution plan or rely on user guidance to construct such a plan. Meanwhile, even simple analytics queries on serverless have a huge space of possible plans, with vast differences in both performance and cost among plans. This paper introduces Odyssey, an end-to-end serverless-native data analytics pipeline that integrates a query planner, cost model and execution engine. Odyssey automatically generates and evaluates serverless query plans, utilizing state space pruning heuristics and a novel search algorithm to identify Pareto-optimal plans that balance cost and performance with low latency even for complex queries. Our evaluations demonstrate that Odyssey accurately predicts both monetary cost and latency, and consistently outperforms AWS Athena on cost and/or latency.
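To make "Pareto-optimal" concrete: a plan is on the Pareto front when no other plan is at least as cheap and at least as fast while being strictly better in one of the two. The sketch below filters a list of candidate plans down to that front. It is a generic illustration, not Odyssey's implementation; the `Plan` type, the plan names, and the cost and latency numbers are all hypothetical.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    """Hypothetical plan record: predicted cost and latency for one candidate."""
    name: str
    cost_usd: float
    latency_s: float


def pareto_front(plans: list[Plan]) -> list[Plan]:
    """Return the plans that no other plan dominates on (cost, latency).

    Sort by cost (ties broken by latency), then sweep: a plan survives only
    if it is strictly faster than every cheaper-or-equal plan seen so far.
    Exact duplicates collapse to a single entry.
    """
    front: list[Plan] = []
    best_latency = float("inf")
    for plan in sorted(plans, key=lambda p: (p.cost_usd, p.latency_s)):
        if plan.latency_s < best_latency:
            front.append(plan)
            best_latency = plan.latency_s
    return front


# Made-up candidates for a single query.
candidates = [
    Plan("few-large-workers", 0.010, 40.0),
    Plan("many-small-workers", 0.030, 12.0),
    Plan("wasteful", 0.035, 45.0),  # costlier and slower than the others
    Plan("balanced", 0.018, 20.0),
]

for plan in pareto_front(candidates):
    print(plan)
```

Only the "wasteful" plan is dropped here; the remaining three each represent a different cost/latency trade-off, and choosing among them is left to the user or a policy.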

Problem

Research questions and friction points this paper is trying to address.

How to generate query plans for serverless analytics automatically, without user guidance

How to explore a vast plan space whose plans differ widely in cost and performance

How to integrate the planner, cost model and execution engine to find Pareto-optimal serverless execution plans

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates planner, cost model and execution engine

Uses pruning heuristics and novel search algorithm

Identifies Pareto-optimal plans balancing cost-performance
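The summary does not describe Odyssey's pruning heuristics or its search algorithm, so the following is only a generic sketch of why pruning helps in this kind of search. It assumes a query is a sequence of stages, each with a few configuration options, and that cost and latency simply add up across stages. Under that assumption a dominated partial plan can never become part of a Pareto-optimal full plan, so it can be discarded early. The `Option` and `Partial` types, the stage labels, and all numbers are hypothetical.

```python
from typing import NamedTuple


class Option(NamedTuple):
    """Hypothetical per-stage configuration with predicted cost and latency."""
    label: str
    cost_usd: float
    latency_s: float


class Partial(NamedTuple):
    """A plan prefix: the options chosen so far and their accumulated totals."""
    choices: tuple[str, ...]
    cost_usd: float
    latency_s: float


def prune(cands: list[Partial]) -> list[Partial]:
    """Keep only non-dominated candidates (cheapest first)."""
    kept: list[Partial] = []
    best_latency = float("inf")
    for c in sorted(cands, key=lambda c: (c.cost_usd, c.latency_s)):
        if c.latency_s < best_latency:
            kept.append(c)
            best_latency = c.latency_s
    return kept


def search(stages: list[list[Option]]) -> list[Partial]:
    """Extend plans stage by stage, pruning dominated prefixes after each stage.

    Assumes additive cost and latency; this is a simplification, not a claim
    about how serverless stages actually compose.
    """
    frontier = [Partial((), 0.0, 0.0)]
    for options in stages:
        extended = [
            Partial(p.choices + (o.label,),
                    p.cost_usd + o.cost_usd,
                    p.latency_s + o.latency_s)
            for p in frontier
            for o in options
        ]
        frontier = prune(extended)
    return frontier


# Made-up two-stage query: a scan followed by a join.
stages = [
    [Option("scan-8w", 0.004, 10.0), Option("scan-64w", 0.006, 3.0)],
    [Option("join-8w", 0.005, 20.0), Option("join-64w", 0.012, 5.0),
     Option("join-bad", 0.020, 25.0)],
]

for plan in search(stages):
    print(plan.choices, round(plan.cost_usd, 3), plan.latency_s)
```

Without pruning, the number of candidate plans is the product of the option counts across stages; pruning after each stage keeps the working set limited to the current front.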
