Structural Indexing of Relational Databases for the Evaluation of Free-Connex Acyclic Conjunctive Queries

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work proposes an index for evaluating free-connex acyclic conjunctive queries (fc-ACQs) over relational databases that exploits structural symmetries among the tuples of a database $D$. The index is built around an auxiliary database $D_{col}$, whose size is related to the number of colors that relational color refinement (Scheidt and Schweikardt, 2025) assigns to $D$. For any fc-ACQ, the answers can be counted, or enumerated with constant delay, after a preprocessing phase that takes time linear in the size of $D_{col}$. Unlike conventional value- or order-based indexes such as B+ trees, this is presented as the first foundational result on indexing internal structural symmetries of a database. The size of $D_{col}$ is logarithmic for binary trees and constant for regular graphs, and it remains linear in the size of $D$ even when $D$ has no structural symmetries, so preprocessing can cost strictly less than the size of the database.

📝 Abstract
We present an index structure to boost the evaluation of free-connex acyclic conjunctive queries (fc-ACQs) over relational databases. The main ingredient of the index associated with a given database $D$ is an auxiliary database $D_{col}$. Our main result states that for any fc-ACQ $Q$ over $D$, we can count the number of answers of $Q$ or enumerate them with constant delay after a preprocessing phase that takes time linear in the size of $D_{col}$. Unlike previous indexing methods based on values or order (e.g., B+ trees), our index is based on structural symmetries among tuples in a database, and the size of $D_{col}$ is related to the number of colors assigned to $D$ by Scheidt and Schweikardt's "relational color refinement" (2025). In the particular case of graphs, this coincides with the minimal size of an equitable partition of the graph. For example, the size of $D_{col}$ is logarithmic in the case of binary trees and constant for regular graphs. Even in the worst case, where $D$ has no structural symmetries among tuples at all, the size of $D_{col}$ is still linear in the size of $D$. Given that the size of $D_{col}$ is bounded by the size of $D$ and can be much smaller (even constant for some families of databases), our index is the first foundational result on indexing internal structural symmetries of a database to evaluate all fc-ACQs with a preprocessing cost that is potentially strictly smaller than the database size.
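To make the graph case concrete, the following is a minimal sketch of classical color refinement on undirected graphs, which computes the coarsest equitable partition. It is not the relational color refinement of Scheidt and Schweikardt and it does not build $D_{col}$; it only illustrates why the number of colors is constant for regular graphs and logarithmic for complete binary trees. The function names `color_refinement`, `cycle`, `complete_binary_tree`, and `num_colors` are hypothetical helpers introduced for this illustration.

```python
def color_refinement(adj):
    """Stable coloring of an undirected graph given as {node: set_of_neighbors}.

    Illustrative only: classical color refinement on graphs, assumed here
    as a stand-in for the graph case mentioned in the abstract.
    """
    colors = {v: 0 for v in adj}  # start with a single color class
    while True:
        # A node's signature is its own color plus the sorted multiset
        # of its neighbors' colors.
        sigs = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new_colors = {v: palette[sigs[v]] for v in adj}
        # Classes can only split, so an unchanged class count means stability.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


def cycle(n):
    """Cycle graph on n >= 3 nodes (a 2-regular graph)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def complete_binary_tree(height):
    """Complete binary tree stored in heap order: parent of i is (i - 1) // 2."""
    n = 2 ** (height + 1) - 1
    adj = {i: set() for i in range(n)}
    for child in range(1, n):
        parent = (child - 1) // 2
        adj[child].add(parent)
        adj[parent].add(child)
    return adj


def num_colors(adj):
    """Number of classes in the coarsest equitable partition."""
    return len(set(color_refinement(adj).values()))
```

For instance, `num_colors(cycle(10))` evaluates to 1 regardless of the cycle length, while `num_colors(complete_binary_tree(3))` evaluates to 4 on a tree with 15 nodes: each depth level forms one color class, so the count grows with the height of the tree rather than with its number of nodes.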
Problem

Research questions and friction points this paper is trying to address.

free-connex acyclic conjunctive queries
relational databases
structural symmetries
query evaluation
indexing
Innovation

Methods, ideas, or system contributions that make the work stand out.

structural indexing
free-connex acyclic conjunctive queries
relational color refinement
constant delay enumeration
database symmetries