🤖 AI Summary
This work addresses the limitations of existing Text-to-SQL evaluation methods, which often fail to capture semantic discrepancies between generated and reference SQL queries, particularly when real database constraints are unavailable. To overcome this, the authors propose a bounded equivalence verification framework that actively searches for database instances capable of distinguishing the semantics of two queries. The core innovation lies in integrating rule-driven constraint mining with large language model-based validation, ensuring that the generated counterexamples are both semantically discriminative and realistic under practical deployment scenarios. Experiments on the BIRD dataset demonstrate that the proposed approach efficiently uncovers numerous semantic errors missed by conventional evaluation metrics, thereby substantially enhancing the validity and fidelity of Text-to-SQL system assessment.
📝 Abstract
We present SpotIt+, an open-source tool for evaluating Text-to-SQL systems via bounded equivalence verification. Given a generated SQL query and the ground truth, SpotIt+ actively searches for database instances that differentiate the two queries. To ensure that the generated counterexamples reflect practically relevant discrepancies, we introduce a constraint-mining pipeline that combines rule-based specification mining over example databases with LLM-based validation. Experimental results on the BIRD dataset show that the mined constraints enable SpotIt+ to generate more realistic differentiating databases, while preserving its ability to efficiently uncover numerous discrepancies between generated and gold SQL queries that are missed by standard test-based evaluation.
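The core idea of bounded equivalence verification described above, searching for a database instance on which two SQL queries disagree, can be illustrated with a minimal sketch. This is a brute-force enumeration over a hypothetical one-table schema `t(id, val)` with a small value domain, not SpotIt+'s actual procedure (which uses symbolic search and mined constraints); the function names `differs` and `find_counterexample` are invented here for illustration.

```python
import itertools
import sqlite3

def differs(sql_a, sql_b, rows):
    """Run both queries on a fresh in-memory database seeded with `rows`
    and report whether their (order-insensitive) results disagree."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, val INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    res_a = sorted(con.execute(sql_a).fetchall())
    res_b = sorted(con.execute(sql_b).fetchall())
    con.close()
    return res_a != res_b

def find_counterexample(sql_a, sql_b, domain=range(3), max_rows=2):
    """Bounded search: enumerate every instance of table t with at most
    `max_rows` rows whose cells are drawn from `domain`, and return the
    first instance that differentiates the two queries (None if they
    agree on every instance within the bound)."""
    cells = list(itertools.product(domain, repeat=2))  # candidate (id, val) rows
    for n in range(max_rows + 1):
        for rows in itertools.combinations_with_replacement(cells, n):
            if differs(sql_a, sql_b, list(rows)):
                return list(rows)
    return None

# Two queries that agree on most small instances but diverge when val = 0:
q_generated = "SELECT id FROM t WHERE val >= 0"
q_gold      = "SELECT id FROM t WHERE val > 0"
print(find_counterexample(q_generated, q_gold))  # → [(0, 0)]
```

Test-based evaluation on a fixed example database can miss this discrepancy whenever the database happens to contain no boundary row; the active search above finds the smallest instance exposing it. A constraint-mining step, as in SpotIt+, would additionally restrict the enumerated instances to those satisfying realistic integrity constraints.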