A Unifying Algorithm for Hierarchical Queries

📅 2025-06-11
🤖 AI Summary
This paper shows that hierarchical queries mark the tractability boundary for the bag-set maximization problem: given a database and a self-join-free Boolean conjunctive query (SJF-BCQ), maximize the value of the query under bag semantics subject to a budget on the number of newly inserted facts. Method: The authors prove that the problem is solvable in polynomial time when the SJF-BCQ is hierarchical and is NP-hard otherwise. They further relate its algebraic structure to probabilistic query evaluation and to Shapley value computation over facts, developing a general algebraic framework based on 2-monoids. Contribution/Results: The work identifies the hierarchical property as the exact dividing line for tractability of bag-set maximization and provides a unified O(nᵏ) polynomial-time algorithm for all three problems. The result is a single algebraic treatment of problems drawn from probabilistic databases, explanation via Shapley values, and optimization.

📝 Abstract
The class of hierarchical queries is known to define the boundary of the dichotomy between tractability and intractability for the following two extensively studied problems about self-join free Boolean conjunctive queries (SJF-BCQ): (i) evaluating a SJF-BCQ on a tuple-independent probabilistic database; (ii) computing the Shapley value of a fact in a database on which a SJF-BCQ evaluates to true. Here, we establish that hierarchical queries define also the boundary of the dichotomy between tractability and intractability for a different natural algorithmic problem, which we call the "bag-set maximization" problem. The bag-set maximization problem associated with a SJF-BCQ $Q$ asks: given a database $\mathcal{D}$, find the biggest value that $Q$ takes under bag semantics on a database $\mathcal{D}'$ obtained from $\mathcal{D}$ by adding at most $\theta$ facts from another given database $\mathcal{D}^r$. For non-hierarchical queries, we show that the bag-set maximization problem is an NP-complete optimization problem. More significantly, for hierarchical queries, we show that all three aforementioned problems (probabilistic query evaluation, Shapley value computation, and bag-set maximization) admit a single unifying polynomial-time algorithm that operates on an abstract algebraic structure, called a "2-monoid". Each of the three problems requires a different instantiation of the 2-monoid tailored for the problem at hand.
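
To make the problem statement concrete, here is a brute-force sketch for one fixed hierarchical query, Q() :- R(x), S(x, y). It only illustrates the definition: it tries every way of adding at most θ facts and is exponential in θ, so it is not the paper's polynomial-time algorithm. The fact encoding and the names `bag_count` and `bag_set_max_bruteforce` are assumptions made for this example.

```python
from itertools import combinations

def bag_count(R, S):
    """Bag-semantics value of Q() :- R(x), S(x, y) on a set database:
    the number of assignments (x, y) that satisfy both atoms."""
    return sum(1 for (x, _y) in S if x in R)

def bag_set_max_bruteforce(D, D_r, theta):
    """D, D_r: sets of facts encoded as ("R", (x,)) or ("S", (x, y)).
    Returns the largest value of Q after adding at most theta facts of D_r."""
    candidates = list(D_r - D)
    best = 0
    for k in range(min(theta, len(candidates)) + 1):
        for added in combinations(candidates, k):
            facts = D | set(added)
            R = {args[0] for (rel, args) in facts if rel == "R"}
            S = {args for (rel, args) in facts if rel == "S"}
            best = max(best, bag_count(R, S))
    return best

# Hypothetical toy instance.
D = {("R", (1,)), ("S", (1, 10)), ("S", (2, 10)), ("S", (2, 11))}
D_r = {("R", (2,)), ("S", (1, 11)), ("S", (1, 12))}

print(bag_set_max_bruteforce(D, D_r, 0))  # 1: only S(1, 10) joins with R(1)
print(bag_set_max_bruteforce(D, D_r, 1))  # 3: adding R(2) unlocks two S facts
print(bag_set_max_bruteforce(D, D_r, 2))  # 4: R(2) plus one new S(1, _) fact
```

With a budget of one, the best single insertion is R(2), because it activates two S facts already present in the database. That interaction between atoms is what makes the choice of facts non-trivial.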
Problem

Research questions and friction points this paper is trying to address.

Establish a tractability dichotomy for bag-set maximization, with hierarchical queries as the boundary
Unify probabilistic query evaluation, Shapley value computation, and bag-set maximization under one algorithm over a 2-monoid
Prove NP-completeness of bag-set maximization for non-hierarchical queries
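
The dichotomy depends on whether a query is hierarchical. Under the standard definition, a conjunctive query is hierarchical if, for every two variables x and y, the sets of atoms containing x and containing y are either nested or disjoint. A minimal check might look like the following, where the dictionary encoding of a query and the name `is_hierarchical` are assumptions for illustration.

```python
from itertools import combinations

def is_hierarchical(atoms):
    """atoms: dict mapping an atom name to the tuple of variables it uses.
    Returns True if, for every pair of variables, their atom sets are
    nested or disjoint."""
    variables = sorted({v for vs in atoms.values() for v in vs})
    # at[v] = names of the atoms in which variable v occurs
    at = {v: {a for a, vs in atoms.items() if v in vs} for v in variables}
    for x, y in combinations(variables, 2):
        ax, ay = at[x], at[y]
        if not (ax <= ay or ay <= ax or ax.isdisjoint(ay)):
            return False
    return True

# Q1() :- R(x), S(x, y): at(y) = {S} is contained in at(x) = {R, S}.
print(is_hierarchical({"R": ("x",), "S": ("x", "y")}))               # True

# Q2() :- R(x), S(x, y), T(y): at(x) = {R, S} and at(y) = {S, T} overlap
# without nesting, the classic non-hierarchical pattern.
print(is_hierarchical({"R": ("x",), "S": ("x", "y"), "T": ("y",)}))  # False
```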
Innovation

Methods, ideas, or system contributions that make the work stand out.

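A single algorithm that handles all three problems on hierarchical queries
The algorithm operates on an abstract algebraic structure, the 2-monoid
Each problem is solved in polynomial time by choosing a problem-specific instantiation of the 2-monoid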
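
The idea of one algorithm with several algebraic instantiations can be sketched for the fixed hierarchical query Q() :- R(x), S(x, y). The sketch reads a 2-monoid loosely as a set with two commutative, associative operations, written `plus` and `times` here, each with an identity; the paper's exact axioms and its instantiations for Shapley values and bag-set maximization may differ and are not reproduced. Only the probabilistic instantiation (independent OR and independent AND) and a plain counting instantiation are shown, and every name in the code is an assumption.

```python
from functools import reduce
from itertools import product

def evaluate(R, S, plus, times, zero):
    """Evaluate Q() :- R(x), S(x, y) as  plus_x [ R(x) times plus_y S(x, y) ].
    R: dict x -> annotation; S: dict (x, y) -> annotation."""
    total = zero
    for x, r_ann in R.items():
        inner = reduce(plus, (a for (x2, _y), a in S.items() if x2 == x), zero)
        total = plus(total, times(r_ann, inner))
    return total

# Probabilistic instantiation for tuple-independent databases.
def p_or(p, q):   # probability that at least one of two independent events holds
    return 1 - (1 - p) * (1 - q)

def p_and(p, q):  # probability that two independent events both hold
    return p * q

def brute_force_probability(R, S):
    """Reference answer: sum the weights of all possible worlds where Q holds."""
    facts = [("R", k, p) for k, p in R.items()] + [("S", k, p) for k, p in S.items()]
    total = 0.0
    for world in product([False, True], repeat=len(facts)):
        weight, r_in, s_in = 1.0, set(), set()
        for present, (rel, key, p) in zip(world, facts):
            weight *= p if present else 1 - p
            if present:
                (r_in if rel == "R" else s_in).add(key)
        if any(x in r_in for (x, _y) in s_in):
            total += weight
    return total

R = {1: 0.5, 2: 0.8}
S = {(1, 10): 0.5, (1, 11): 0.5, (2, 10): 0.25}
prob = evaluate(R, S, p_or, p_and, 0.0)

# Counting instantiation: ordinary + and * with every fact annotated by 1.
count = evaluate({x: 1 for x in R}, {k: 1 for k in S},
                 lambda a, b: a + b, lambda a, b: a * b, 0)
```

The same `evaluate` function runs in both cases; only the operations passed in change. The probabilistic pair does not satisfy distributivity, which is one reason a structure weaker than a semiring is natural here.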