🤖 AI Summary
This paper addresses symbolic counterexample generation for SQL queries: given *n* SQL queries and a desired output property encoded as an SMT formula, the goal is to automatically synthesize an input instance *I* such that the outputs of all queries on *I* satisfy the property. Such an input can refute the equivalence of two queries or disambiguate among candidate queries. We propose the first conflict-driven framework that dynamically searches over under-approximations: rather than fixing a single under-approximation of each query, it searches an expressive family of under-approximations that collectively cover the query behaviors of interest, which keeps reasoning lightweight while preserving completeness. Our approach integrates SMT solving, formal modeling of SQL semantics, programmatic construction of under-approximations, and conflict-guided incremental search. Evaluated on over 30,000 benchmarks, our method significantly outperforms all existing techniques on both SQL equivalence refutation and query disambiguation.
📝 Abstract
We present a novel symbolic reasoning engine for SQL which can efficiently generate an input $I$ for $n$ queries $P_1, \cdots, P_n$, such that their outputs on $I$ satisfy a given property (expressed in SMT). This is useful in different contexts, such as disproving equivalence of two SQL queries and disambiguating a set of queries. Our first idea is to reason about an under-approximation of each $P_i$ -- that is, a subset of $P_i$'s input-output behaviors. While it makes our approach both semantics-aware and lightweight, this idea alone is incomplete (as a fixed under-approximation might miss some behaviors of interest). Therefore, our second idea is to perform search over an expressive family of under-approximations (which collectively cover all program behaviors of interest), thereby making our approach complete. We have implemented these ideas in a tool, Polygon, and evaluated it on over 30,000 benchmarks across two tasks (namely, SQL equivalence refutation and query disambiguation). Our evaluation results show that Polygon significantly outperforms all prior techniques.
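
To make the equivalence-refutation task concrete, here is a minimal sketch in Python. It is not Polygon's algorithm: the SMT-based reasoning and conflict-driven search described above are replaced by plain enumeration with the standard-library `sqlite3` module, and the "family of under-approximations" is simplified to a family of input-size bounds, where each bound considers only tables with exactly that many rows. The table `t(a)`, the value domain, the two queries, and the helper names `run` and `find_counterexample` are all assumptions made for illustration.

```python
import itertools
import sqlite3
from collections import Counter


def run(query, rows):
    """Evaluate `query` on a fresh in-memory table t(a) holding `rows`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (a INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in rows])
        # Compare outputs as bags (multisets), ignoring row order.
        return Counter(conn.execute(query).fetchall())
    finally:
        conn.close()


def find_counterexample(q1, q2, domain=(0, 1, 2), max_rows=3):
    """Search bounded inputs for one on which q1 and q2 disagree.

    Each `size` plays the role of one restricted view of the queries:
    only their behavior on tables with exactly `size` rows is examined.
    Hypothetical helper; bounds and domain are illustrative assumptions.
    """
    for size in range(max_rows + 1):
        for rows in itertools.combinations_with_replacement(domain, size):
            if run(q1, rows) != run(q2, rows):
                return list(rows)  # input I that refutes equivalence
    return None  # no witness within the explored bounds


Q1 = "SELECT a FROM t WHERE a > 1"
Q2 = "SELECT a FROM t WHERE a >= 1"
witness = find_counterexample(Q1, Q2)
print(witness)
```

A `None` result here only means that no distinguishing input exists within the explored bounds; it says nothing about equivalence in general. That gap is what the abstract's second idea targets, by searching a family of under-approximations that collectively cover all behaviors of interest.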