DeepEye-SQL: A Software-Engineering-Inspired Text-to-SQL Framework

📅 2025-10-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current Text-to-SQL methods fall short of system-level reliability, primarily because they lack structured, verifiable workflow orchestration. This paper proposes a paradigm shift: treating Text-to-SQL as a software engineering problem and designing an SDLC-inspired framework for reliable SQL generation. The framework comprises four stages: semantic alignment, N-version parallel generation, toolchain-driven deterministic verification, and confidence-aware pairwise adjudication. Together they integrate semantic value retrieval, robust schema linking, multi-path reasoning, unit-test-based validation, and LLM-guided revision in a single end-to-end workflow. Without fine-tuning and using only open-source ~30B LLMs, it achieves 73.5% execution accuracy on BIRD-Dev and 89.8% on Spider-Test, outperforming state-of-the-art methods. The authors conclude that principled orchestration, rather than LLM scaling alone, is key to system-level reliability in Text-to-SQL.

📝 Abstract

Large language models (LLMs) have advanced Text-to-SQL, yet existing solutions still fall short of system-level reliability. The limitation is not merely in individual modules (e.g., schema linking, reasoning, and verification) but more critically in the lack of structured orchestration that enforces correctness across the entire workflow. This gap motivates a paradigm shift: treating Text-to-SQL not as free-form language generation but as a software-engineering problem that demands structured, verifiable orchestration. We present DeepEye-SQL, a software-engineering-inspired framework that reframes Text-to-SQL as the development of a small software program, executed through a verifiable process guided by the Software Development Life Cycle (SDLC). DeepEye-SQL integrates four synergistic stages: it grounds ambiguous user intent through semantic value retrieval and robust schema linking; enhances fault tolerance with N-version SQL generation using diverse reasoning paradigms; ensures deterministic verification via a tool-chain of unit tests and targeted LLM-guided revision; and introduces confidence-aware selection that clusters execution results to estimate confidence and then takes a high-confidence shortcut or runs unbalanced pairwise adjudication in low-confidence cases, yielding a calibrated, quality-gated output. This SDLC-aligned workflow transforms ad hoc query generation into a disciplined engineering process. Using ~30B open-source LLMs without any fine-tuning, DeepEye-SQL achieves 73.5% execution accuracy on BIRD-Dev and 89.8% on Spider-Test, outperforming state-of-the-art solutions. This highlights that principled orchestration, rather than LLM scaling alone, is key to achieving system-level reliability in Text-to-SQL.
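The confidence-aware selection stage described in the abstract can be pictured with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 0.6 threshold, the use of SQLite, and the `adjudicate` callback (a stand-in for an LLM judge) are all hypothetical, and the pairwise loop is a plain sequential comparison rather than the paper's "unbalanced" adjudication, whose details the abstract does not give.

```python
import sqlite3
from collections import defaultdict


def result_signature(conn, sql):
    """Execute a candidate and return an order-insensitive signature of its
    result set, or None if the query fails to run."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(repr(row) for row in rows))


def select_sql(conn, candidates, threshold=0.6, adjudicate=None):
    """Cluster candidates by execution result and pick one.

    Confidence is assumed here to be the share of all candidates that fall
    into the largest cluster. `adjudicate(a, b)` is a hypothetical callback
    that returns the preferred of two SQL strings.
    """
    clusters = defaultdict(list)
    for sql in candidates:
        signature = result_signature(conn, sql)
        if signature is not None:
            clusters[signature].append(sql)
    if not clusters:
        return None, 0.0

    # Largest cluster first; ties keep their original order (stable sort).
    ranked = sorted(clusters.values(), key=len, reverse=True)
    confidence = len(ranked[0]) / len(candidates)

    # High-confidence shortcut: accept the majority cluster directly.
    if confidence >= threshold or len(ranked) == 1 or adjudicate is None:
        return ranked[0][0], confidence

    # Low-confidence path: compare cluster representatives pairwise.
    best = ranked[0][0]
    for cluster in ranked[1:]:
        best = adjudicate(best, cluster[0])
    return best, confidence
```

Clustering on execution results rather than on SQL text means that syntactically different but equivalent queries vote together, which is what makes the cluster size usable as a confidence signal.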

Problem

Research questions and friction points this paper is trying to address.

Addressing system-level reliability gaps in Text-to-SQL frameworks

Treating Text-to-SQL as a structured software engineering problem

Providing verifiable orchestration across the SQL generation workflow

Innovation

Methods, ideas, or system contributions that make the work stand out.

Software-engineering-inspired framework for Text-to-SQL

Multi-stage SDLC-aligned workflow with verification

Confidence-aware selection with clustering and adjudication
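The verification step in the workflow above pairs deterministic checks with targeted revision. A minimal sketch of that loop follows, with heavy assumptions: the two checkers, the `revise` callback (standing in for an LLM-guided fix), the two-round limit, and the use of SQLite are all hypothetical choices for illustration, not the checks DeepEye-SQL actually ships.

```python
import sqlite3


def check_executes(conn, sql):
    """Return an error message if SQLite cannot prepare the query, else None."""
    try:
        conn.execute("EXPLAIN QUERY PLAN " + sql)
    except sqlite3.Error as exc:
        return f"execution error: {exc}"
    return None


def check_non_empty(conn, sql):
    """Return a message if the query runs but yields no rows, else None."""
    try:
        if conn.execute(sql).fetchone() is None:
            return "query returned no rows"
    except sqlite3.Error:
        return None  # execution failures are reported by check_executes
    return None


# Hypothetical tool-chain: each checker is a small deterministic unit test.
CHECKERS = [check_executes, check_non_empty]


def run_checks(conn, sql):
    """Collect the failure messages from every checker."""
    return [msg for check in CHECKERS if (msg := check(conn, sql))]


def verify_and_revise(conn, sql, revise, max_rounds=2):
    """Run the checkers; on failure, pass the messages to `revise(sql, failures)`
    and retry. Returns the final SQL and any failures still outstanding."""
    for _ in range(max_rounds):
        failures = run_checks(conn, sql)
        if not failures:
            return sql, []
        sql = revise(sql, failures)
    return sql, run_checks(conn, sql)
```

Keeping the checks deterministic and handing only their failure messages to the reviser is one way to make the revision "targeted": the model is asked to fix a named defect instead of regenerating the whole query.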

🔎 Similar Papers

A Survey on Employing Large Language Models for Text-to-SQL Tasks