SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL

📅 2025-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Text-to-SQL systems frequently generate SQL queries that are syntactically valid but semantically incorrect: they execute without error yet return the wrong results. Existing large language model (LLM)-based approaches offer little interpretable, fine-grained error localization. To address this, the paper proposes SQLens, a dual-source semantic error detection framework that jointly leverages database execution feedback and LLM internal token-level logits for clause-level error localization and controllable query regeneration. The method integrates execution trace analysis, logits-based confidence monitoring, and error-signal-driven iterative refinement, going beyond conventional black-box self-verification. Evaluated on the Spider and BIRD benchmarks, it outperforms the best LLM-based self-evaluation method by 25.78% in error detection F1-score and improves end-to-end execution accuracy by up to 20%, making Text-to-SQL systems more robust and easier to debug.

📝 Abstract

Text-to-SQL systems translate natural language (NL) questions into SQL queries, enabling non-technical users to interact with structured data. While large language models (LLMs) have shown promising results on the text-to-SQL task, they often produce semantically incorrect yet syntactically valid queries, with limited insight into their reliability. We propose SQLens, an end-to-end framework for fine-grained detection and correction of semantic errors in LLM-generated SQL. SQLens integrates error signals from both the underlying database and the LLM to identify potential semantic errors within SQL clauses. It further leverages these signals to guide query correction. Empirical results on two public benchmarks show that SQLens outperforms the best LLM-based self-evaluation method by 25.78% in F1 for error detection, and improves execution accuracy of out-of-the-box text-to-SQL systems by up to 20%.
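The abstract's idea of combining database-side and LLM-side error signals to flag suspicious clauses can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's implementation: the `Signal` record, the `empty_result_signal` heuristic, the fixed weights, the additive scoring in `flag_clauses`, and the toy `singer` table are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Signal:
    name: str      # hypothetical signal name
    clause: str    # SQL clause the signal points at, e.g. "WHERE"
    source: str    # "database" or "llm"
    weight: float  # assumed confidence that the clause is wrong, in [0, 1]


def empty_result_signal(conn, sql):
    """Database-side heuristic: a query returning no rows may have a wrong filter."""
    rows = conn.execute(sql).fetchall()
    if not rows:
        return [Signal("empty_result", "WHERE", "database", 0.6)]
    return []


def flag_clauses(signals, threshold=0.5):
    """Sum signal weights per clause; return clauses scoring at or above the threshold."""
    scores = {}
    for s in signals:
        scores[s.clause] = scores.get(s.clause, 0.0) + s.weight
    return sorted(c for c, v in scores.items() if v >= threshold)


# Toy database: the stored value is 'France', but the generated query filters on 'FR'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, country TEXT)")
conn.execute("INSERT INTO singer VALUES ('Ana', 'France')")

sql = "SELECT name FROM singer WHERE country = 'FR'"
signals = empty_result_signal(conn, sql)
# Stand-in for an LLM-side signal; a real system would obtain this from the model.
signals.append(Signal("llm_self_check", "WHERE", "llm", 0.3))

flagged = flag_clauses(signals)
print(flagged)
```

In this sketch the flagged clause names would then be passed to a correction step, for example as hints in a regeneration prompt. How SQLens actually weighs and combines its signals is not specified in the abstract.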

Problem

Research questions and friction points this paper is trying to address.

Detects semantic errors in LLM-generated SQL queries

Corrects SQL queries using database and LLM error signals

Improves execution accuracy of text-to-SQL systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end framework for detecting and correcting semantic errors in SQL

Integrates database and LLM error signals

Improves execution accuracy by up to 20%

🔎 Similar Papers

SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration