SQLGovernor: An LLM-powered SQL Toolkit for Real World Application

📅 2025-09-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
In real-world OLAP analytics, SQL queries frequently suffer from syntactic errors, inefficient execution, and semantic inconsistencies. To address these challenges, this paper proposes a knowledge-driven, fragment-based framework for SQL governance. The framework decomposes queries into verifiable fragments and integrates large language model (LLM) reasoning with multi-source rule validation, database execution feedback, and expert-guided hybrid self-learning, which substantially reduces the cognitive load on the LLM and enables continuous optimization. Evaluated on BIRD, BIRD CRITIC, and industrial datasets, the method improves base-model SQL generation accuracy by up to 10% and significantly reduces manual intervention. Deployed in production environments, the framework demonstrates robustness and practical efficacy for enterprise-scale analytical workloads.

📝 Abstract
SQL queries in real-world analytical environments, whether written by humans or generated automatically, often suffer from syntax errors, inefficiency, or semantic misalignment, especially in complex OLAP scenarios. To address these challenges, we propose SQLGovernor, an LLM-powered SQL toolkit that unifies multiple functionalities, including syntax correction, query rewriting, query modification, and consistency verification, within a structured framework enhanced by knowledge management. SQLGovernor introduces a fragment-wise processing strategy to enable fine-grained rewriting and localized error correction, significantly reducing the cognitive load on the LLM. It further incorporates a hybrid self-learning mechanism guided by expert feedback, allowing the system to continuously improve through DBMS output analysis and rule validation. Experiments on benchmarks such as BIRD and BIRD CRITIC, as well as industrial datasets, show that SQLGovernor consistently boosts the performance of base models by up to 10%, while minimizing reliance on manual expertise. Deployed in production environments, SQLGovernor demonstrates strong practical utility and effective performance.
Problem

Research questions and friction points this paper is trying to address.

Addressing syntax errors and inefficiency in real-world SQL queries
Resolving semantic misalignment in complex OLAP analytical environments
Reducing cognitive load on LLMs through fine-grained query processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-powered SQL toolkit with unified functionalities
Fragment-wise processing for fine-grained error correction
Hybrid self-learning mechanism with expert feedback
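
The fragment-wise idea can be illustrated with a small sketch. This is not SQLGovernor's implementation: the clause-level splitting, the single `= NULL` rewrite rule standing in for an LLM call, and all function names (`split_fragments`, `repair_fragment`, `govern`) are assumptions made for illustration. SQLite's `EXPLAIN` plays the role of DBMS feedback, and the splitter only handles flat queries without subqueries.

```python
import re
import sqlite3

# Clause keywords used as fragment boundaries (an assumption of this sketch).
CLAUSES = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT"]
_BOUNDARY = re.compile(
    r"\b(" + "|".join(k.replace(" ", r"\s+") for k in CLAUSES) + r")\b",
    re.IGNORECASE,
)


def split_fragments(sql):
    """Split a flat query (no subqueries) into (keyword, body) fragments."""
    # re.split with a capturing group yields ['', kw1, body1, kw2, body2, ...].
    parts = _BOUNDARY.split(sql.strip().rstrip(";"))
    fragments = []
    for i in range(1, len(parts) - 1, 2):
        keyword = " ".join(parts[i].upper().split())  # normalise "group  by"
        fragments.append((keyword, parts[i + 1].strip()))
    return fragments


def repair_fragment(keyword, body):
    """Hypothetical stand-in for an LLM call that sees only one fragment."""
    if keyword == "WHERE":
        # `x = NULL` never evaluates to true in SQL; rewrite it to `x IS NULL`.
        body = re.sub(r"=\s*NULL\b", "IS NULL", body, flags=re.IGNORECASE)
    return body


def reassemble(fragments):
    """Join the fragments back into a single query string."""
    return " ".join(f"{keyword} {body}" for keyword, body in fragments)


def validate(conn, sql):
    """Ask the DBMS to compile the query; return None on success, else the error text."""
    try:
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)


def govern(conn, sql):
    """Repair each fragment locally, reassemble, then check against the database."""
    fragments = [(kw, repair_fragment(kw, body)) for kw, body in split_fragments(sql)]
    fixed = reassemble(fragments)
    return fixed, validate(conn, fixed)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (name TEXT, amount REAL, coupon TEXT)")
fixed, error = govern(
    conn, "select name, sum(amount) from orders where coupon = NULL group by name"
)
print(fixed)
print(error)
```

In a fuller system the error text returned by `validate` would be fed back into the next repair attempt for the offending fragment only, which is one plausible reading of how execution feedback keeps each LLM call small.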
Authors

Jie Jiang
Department of Data Platform, TEG, Tencent Inc.
Siqi Shen
Center of Machine Learning Research, Peking University
Haining Xie
Department of Data Platform, TEG, Tencent Inc.
Yang Li
Department of Data Platform, TEG, Tencent Inc.
Yu Shen
Department of Data Platform, TEG, Tencent Inc.
Danqing Huang
Microsoft
Natural Language Processing, Design Intelligence
Bo Qian
Department of Data Platform, TEG, Tencent Inc.
Yinjun Wu
School of Computer Science, Peking University
Wentao Zhang
Institute of Physics, Chinese Academy of Sciences
photoemission, superconductivity, cuprate, htsc, time-resolved
Bin Cui
School of Computer Science, Peking University
Peng Chen
Department of Data Platform, TEG, Tencent Inc.