Galley: Modern Query Optimization for Sparse Tensor Programs

📅 2024-08-27
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In sparse tensor programming, manually selecting computation orders and storage formats is complex and error-prone, hindering both developer productivity and performance portability. This paper introduces Galley, a declarative optimization framework that pioneers the adaptation of database query optimization principles to sparse tensor compilation. Galley employs a cost model to generate logical and physical execution plans, enabling cross-format optimization, joint scheduling across multiple operators, and dynamic, format-aware operator fusion. It unifies sparse tensor algebra via a dedicated intermediate representation (IR) and interoperates with established backends such as TACO and SPARSITY. Experiments on machine learning workloads, subgraph counting, and iterative graph algorithms show that Galley matches or exceeds hand-tuned scheduling performance (an average 2.3× speedup) while substantially lowering development complexity and improving cross-platform performance portability.
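The impact of the ordering decisions Galley automates can be seen in a toy example (a minimal sketch using SciPy, not Galley's API or cost model): for the chain product A·B·x over sparse matrices, the two evaluation orders produce the same result but do vastly different amounts of work, since one materializes a sparse-sparse product while the other only performs sparse matrix-vector multiplies.

```python
import numpy as np
from scipy.sparse import random as sprandom

# Toy illustration (not Galley's API): the evaluation order of A @ B @ x
# changes the cost dramatically, while the result is identical.
n = 2000
A = sprandom(n, n, density=0.001, format="csr")
B = sprandom(n, n, density=0.001, format="csr")
x = np.random.rand(n)

# Plan 1: (A @ B) @ x — materializes an intermediate n×n sparse product.
y1 = (A @ B) @ x

# Plan 2: A @ (B @ x) — two sparse matrix-vector products, O(nnz) each.
y2 = A @ (B @ x)

assert np.allclose(y1, y2)  # same result, very different cost
```

A cost-based optimizer picks between such plans automatically instead of requiring the user to hand-schedule each step.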

📝 Abstract
The tensor programming abstraction has become a foundational paradigm for modern computing. This framework allows users to write high-performance programs for bulk computation via a high-level imperative interface. Recent work has extended this paradigm to sparse tensors (i.e., tensors where most entries are not explicitly represented) with the use of sparse tensor compilers. These systems excel at producing efficient code for computation over sparse tensors, which may be stored in a wide variety of formats. However, they require the user to manually choose the order of operations and the data formats at every step. Unfortunately, these decisions are both highly impactful and complicated, requiring significant effort to optimize by hand. In this work, we present Galley, a system for declarative sparse tensor programming. Galley performs cost-based optimization to lower these programs first to a logical plan and then to a physical plan. It then leverages sparse tensor compilers to execute the physical plan efficiently. We show that Galley achieves high performance on a wide variety of problems including machine learning algorithms, subgraph counting, and iterative graph algorithms.
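The logical-to-physical lowering the abstract describes could be sketched as follows (a minimal illustration with invented names and an invented cost model, not Galley's actual IR): a logical plan fixes what to compute, a physical plan adds concrete decisions such as loop order and storage formats, and the optimizer ranks candidate physical plans by estimated cost.

```python
from dataclasses import dataclass

@dataclass
class LogicalPlan:
    # What to compute, independent of formats and loop order.
    expr: str

@dataclass
class PhysicalPlan:
    # How to compute it: loop order plus per-tensor storage formats.
    logical: LogicalPlan
    loop_order: tuple
    formats: dict
    cost: float = 0.0

def toy_cost(loop_order, formats, nnz=10_000):
    # Invented cost model: a CSR input favors an outer "i" loop; a
    # mismatched loop order pays a penalty for strided sparse access.
    penalty = 1.0 if loop_order[0] == "i" and formats["A"] == "CSR" else 10.0
    return penalty * nnz

logical = LogicalPlan("C[i,j] = sum_k A[i,k] * B[k,j]")
candidates = [
    PhysicalPlan(logical, ("i", "k", "j"), {"A": "CSR", "B": "CSR"}),
    PhysicalPlan(logical, ("k", "i", "j"), {"A": "CSR", "B": "CSR"}),
]
for p in candidates:
    p.cost = toy_cost(p.loop_order, p.formats)

# The optimizer keeps the cheapest physical plan for execution.
best = min(candidates, key=lambda p: p.cost)
print(best.loop_order)
```

In Galley itself, the chosen physical plan is handed off to a sparse tensor compiler backend for code generation rather than interpreted directly.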
Problem

Research questions and friction points this paper is trying to address.

Sparse Tensor Programming
Manual Decision-making
Machine Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse Tensor Optimization
Automatic Query Optimization
Machine Learning Applications