DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
In NL2SQL, large language models (LLMs) suffer from two key bottlenecks: overly coarse-grained task decomposition and inaccurate identification of domain-specific keywords, leading to high SQL generation error rates. Moreover, existing benchmarks lack fine-grained task segmentation and explicit keyword annotations, hindering model interpretability and performance. To address these issues, we propose DeKeyNLU—a high-quality dataset featuring explicit hierarchical task decomposition and domain keyword labeling—and DeKeySQL, an end-to-end pipeline comprising three modules: question understanding, entity retrieval, and SQL generation. DeKeySQL integrates retrieval-augmented generation (RAG) with chain-of-thought (CoT) reasoning to enhance semantic grounding. Evaluated on BIRD and Spider, our approach achieves +6.79% and +4.5% absolute improvements in execution accuracy, respectively, effectively mitigating over-decomposition and keyword omission. This work establishes a more interpretable and scalable paradigm for semantic understanding in NL2SQL.

📝 Abstract
Natural Language to SQL (NL2SQL) provides a new model-centric paradigm that simplifies database access for non-technical users by converting natural language queries into SQL commands. Recent advancements, particularly those integrating Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) reasoning, have made significant strides in enhancing NL2SQL performance. However, challenges such as inaccurate task decomposition and keyword extraction by LLMs remain major bottlenecks, often leading to errors in SQL generation. While existing datasets aim to mitigate these issues by fine-tuning models, they struggle with over-fragmentation of tasks and a lack of domain-specific keyword annotations, limiting their effectiveness. To address these limitations, we present DeKeyNLU, a novel dataset of 1,500 meticulously annotated QA pairs aimed at refining task decomposition and enhancing keyword extraction precision for the RAG pipeline. Building on models fine-tuned with DeKeyNLU, we propose DeKeySQL, a RAG-based NL2SQL pipeline that employs three distinct modules (user question understanding, entity retrieval, and generation) to improve SQL generation accuracy. We benchmarked multiple model configurations within the DeKeySQL RAG pipeline. Experimental results demonstrate that fine-tuning with DeKeyNLU significantly improves SQL generation accuracy on both the BIRD (62.31% to 69.10%) and Spider (84.2% to 88.7%) dev sets.
Problem

Research questions and friction points this paper is trying to address.

Addresses inaccurate task decomposition in NL2SQL generation
Improves keyword extraction precision for SQL queries
Overcomes dataset limitations with domain-specific annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel dataset for task decomposition refinement
RAG-based pipeline with three distinct modules
Fine-tuning enhances SQL generation accuracy
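The three-module pipeline summarized above (question understanding, entity retrieval, SQL generation) can be sketched roughly as follows. This is a minimal toy illustration, not the authors' implementation: every function body is a stand-in (the paper's modules use an LLM fine-tuned on DeKeyNLU, embedding-based entity retrieval, and CoT-prompted generation), and the schema, stopword list, and function names are hypothetical.

```python
# Toy sketch of a DeKeySQL-style three-stage pipeline.
# All logic below is a placeholder for the paper's LLM-based modules.

STOPWORDS = {"what", "is", "the", "of", "in", "for", "show", "me", "list", "all"}

def understand_question(question):
    """Stand-in for the question-understanding module: the paper uses an LLM
    fine-tuned on DeKeyNLU for task decomposition and keyword extraction."""
    tokens = [t.strip("?.,").lower() for t in question.split()]
    keywords = [t for t in tokens if t and t not in STOPWORDS]
    return {"main_task": question, "keywords": keywords}

def retrieve_entities(keywords, schema):
    """Stand-in for entity retrieval: naive substring matching of keywords
    against table/column names (the paper retrieves database entities with
    a dedicated retrieval module)."""
    hits = {}
    for table, columns in schema.items():
        matched = [c for c in columns if any(k in c.lower() for k in keywords)]
        if matched or table.lower() in keywords:
            hits[table] = matched or columns
    return hits

def generate_sql(understanding, entities):
    """Stand-in for SQL generation: the paper prompts an LLM with CoT
    reasoning instead of templating."""
    table = next(iter(entities))
    cols = ", ".join(entities[table]) or "*"
    return f"SELECT {cols} FROM {table};"

# Hypothetical two-table schema for illustration.
schema = {"students": ["name", "grade"], "courses": ["title", "credits"]}
u = understand_question("List the grade of students")
e = retrieve_entities(u["keywords"], schema)
sql = generate_sql(u, e)
print(sql)  # SELECT grade FROM students;
```

The point of the modular split is that errors can be attributed to a specific stage (e.g. a missed keyword versus a wrong join), which is what the DeKeyNLU annotations are designed to supervise.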
Jian Chen
The Hong Kong University of Science and Technology (Guangzhou)
Zhenyan Chen
South China University of Technology
Xuming Hu
Assistant Professor, HKUST(GZ) / HKUST
Natural Language Processing, Large Language Model
Peilin Zhou
HKUST; Peking University
Sequential Recommendation, Natural Language Processing
Yining Hua
Harvard University
Han Fang
HSBC
Cissy Hing Yee Choy
University of Chicago
Xinmei Ke
HSBC
Jingfeng Luo
HSBC
Zixuan Yuan
Postdoctoral Associate, University of Rochester
Anesthesia, Lipid, Spatial Biochemistry, Ion Channel