🤖 AI Summary
SPARQL query execution in Stardog, a commercial knowledge-graph platform, is bottlenecked on CPU for compute-intensive queries, while its classical tuple-at-a-time Volcano executor remains well suited to disk-bound and OLTP-style workloads. Method: The paper presents BARQ, a vectorized query execution layer for Stardog's SPARQL engine. BARQ refactors the most critical operators (particularly joins) to process batches of tuples at a time, and the paper describes a gradual integration path that let a small- to medium-size engineering team plug the new executor into a mature, tightly integrated Volcano-based system without changing existing query interfaces. Contribution/Results: Experiments show that BARQ makes Stardog substantially faster on CPU-bound queries (over 40% lower end-to-end latency on average) while preserving performance on disk-bound and OLTP-style queries, achieving significant speedups without compromising compatibility or generality.
📝 Abstract
Stardog is a commercial Knowledge Graph platform built on top of an RDF graph database whose primary means of communication is a standardized graph query language called SPARQL. This paper describes our journey of developing a more performant query execution layer and plugging it into Stardog's query engine. The new executor, called BARQ, is based on the well-known principle of processing batches of tuples at a time in the most critical query operators, particularly joins. In addition to presenting BARQ, the paper describes the challenges of integrating it into a mature, tightly integrated system based on the classical tuple-at-a-time Volcano model. It offers a gradual approach to overcoming the challenges that small- to medium-size engineering teams typically face. Finally, the paper presents experimental results showing that BARQ makes Stardog substantially faster on CPU-bound queries without sacrificing performance on disk-bound and OLTP-style queries.
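To make the core idea concrete, here is a minimal sketch (in Python, purely illustrative; it is not Stardog's or BARQ's actual code, and all class and function names are invented for this example) contrasting the classical Volcano model, where each operator returns one tuple per `next()` call, with a batch-at-a-time operator that amortizes per-call overhead across many tuples:

```python
BATCH_SIZE = 1024  # typical vectorized engines use batches of ~1K tuples

class VolcanoScan:
    """Volcano-style leaf operator: one tuple per next() call."""
    def __init__(self, rows):
        self._it = iter(rows)
    def next(self):
        return next(self._it, None)  # None signals end of stream

class VolcanoFilter:
    """Volcano-style filter: pulls tuples one at a time from its child."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next(self):
        while (t := self.child.next()) is not None:
            if self.pred(t):
                return t
        return None

class BatchScan:
    """Vectorized leaf operator: returns a whole batch per call."""
    def __init__(self, rows, batch_size=BATCH_SIZE):
        self.rows, self.batch_size, self.pos = rows, batch_size, 0
    def next_batch(self):
        if self.pos >= len(self.rows):
            return None
        batch = self.rows[self.pos:self.pos + self.batch_size]
        self.pos += len(batch)
        return batch

class BatchFilter:
    """Vectorized filter: one tight loop over each batch, so the
    per-call interpretation overhead is paid once per batch rather
    than once per tuple."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next_batch(self):
        while (batch := self.child.next_batch()) is not None:
            kept = [t for t in batch if self.pred(t)]
            if kept:
                return kept
        return None

def drain_volcano(op):
    out = []
    while (t := op.next()) is not None:
        out.append(t)
    return out

def drain_batched(op):
    out = []
    while (b := op.next_batch()) is not None:
        out.extend(b)
    return out

rows = list(range(10_000))
pred = lambda x: x % 7 == 0
# Both models produce identical results; only the call granularity differs.
assert drain_volcano(VolcanoFilter(VolcanoScan(rows), pred)) == \
       drain_batched(BatchFilter(BatchScan(rows), pred))
```

The design tension the paper addresses is visible even here: the batched pipeline does less per-tuple virtual dispatch (good for CPU-bound queries), while the tuple-at-a-time pipeline returns its first result without materializing a batch, which matters for latency-sensitive OLTP-style queries.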