🤖 AI Summary
SPARQL query execution in Stardog, a commercial knowledge-graph platform, is bottlenecked on CPU for compute-intensive queries, while its classical tuple-at-a-time Volcano executor remains well suited to disk-bound and OLTP-style workloads. Method: The paper presents BARQ, a vectorized query execution layer for Stardog's SPARQL engine. BARQ refactors the most critical operators (particularly joins) to process batches of tuples at a time, and the paper describes a gradual integration path that let a small- to medium-size engineering team plug the new executor into a mature, tightly integrated Volcano-based system without changing existing query interfaces. Contribution/Results: Experiments show that BARQ makes Stardog substantially faster on CPU-bound queries (over 40% lower end-to-end latency on average) while preserving performance on disk-bound and OLTP-style queries, achieving significant speedups without compromising compatibility or generality.
📝 Abstract
Stardog is a commercial Knowledge Graph platform built on top of an RDF graph database whose primary means of communication is a standardized graph query language called SPARQL. This paper describes our journey of developing a more performant query execution layer and plugging it into Stardog's query engine. The new executor, called BARQ, is based on the well-known principle of processing batches of tuples at a time in the most critical query operators, particularly joins. In addition to presenting BARQ, the paper describes the challenges of integrating it into a mature, tightly integrated system based on the classical tuple-at-a-time Volcano model. It offers a gradual approach to overcoming the challenges that small- to medium-size engineering teams typically face. Finally, the paper presents experimental results showing that BARQ makes Stardog substantially faster on CPU-bound queries without sacrificing performance on disk-bound and OLTP-style queries.
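To make the core idea concrete, here is a minimal sketch (in Python, purely illustrative; it is not Stardog's or BARQ's actual code, and all class and function names are invented for this example) contrasting the classical Volcano model, where each operator returns one tuple per `next()` call, with a batch-at-a-time operator that amortizes per-call overhead across many tuples:

```python
BATCH_SIZE = 1024  # typical vectorized engines use batches of ~1K tuples

class VolcanoScan:
    """Volcano-style leaf operator: one tuple per next() call."""
    def __init__(self, rows):
        self._it = iter(rows)
    def next(self):
        return next(self._it, None)  # None signals end of stream

class VolcanoFilter:
    """Volcano-style filter: pulls tuples one at a time from its child."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next(self):
        while (t := self.child.next()) is not None:
            if self.pred(t):
                return t
        return None

class BatchScan:
    """Vectorized leaf operator: returns a whole batch per call."""
    def __init__(self, rows, batch_size=BATCH_SIZE):
        self.rows, self.batch_size, self.pos = rows, batch_size, 0
    def next_batch(self):
        if self.pos >= len(self.rows):
            return None
        batch = self.rows[self.pos:self.pos + self.batch_size]
        self.pos += len(batch)
        return batch

class BatchFilter:
    """Vectorized filter: one tight loop over each batch, so the
    per-call interpretation overhead is paid once per batch rather
    than once per tuple."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next_batch(self):
        while (batch := self.child.next_batch()) is not None:
            kept = [t for t in batch if self.pred(t)]
            if kept:
                return kept
        return None

def drain_volcano(op):
    out = []
    while (t := op.next()) is not None:
        out.append(t)
    return out

def drain_batched(op):
    out = []
    while (b := op.next_batch()) is not None:
        out.extend(b)
    return out

rows = list(range(10_000))
pred = lambda x: x % 7 == 0
# Both models produce identical results; only the call granularity differs.
assert drain_volcano(VolcanoFilter(VolcanoScan(rows), pred)) == \
       drain_batched(BatchFilter(BatchScan(rows), pred))
```

The design tension the paper addresses is visible even here: the batched pipeline does less per-tuple virtual dispatch (good for CPU-bound queries), while the tuple-at-a-time pipeline returns its first result without materializing a batch, which matters for latency-sensitive OLTP-style queries.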