Enhancing Text-to-SQL Capabilities of Large Language Models via Domain Database Knowledge Injection

📅 2024-09-24
🏛️ European Conference on Artificial Intelligence
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) frequently exhibit hallucination errors in Text-to-SQL tasks, such as generating invalid column names or matching values to the wrong columns, because they lack domain knowledge of database schemas (e.g., table and column names) and real-world cell values. To address this, we propose a database knowledge injection framework that jointly models structured schema information and authentic cell values as prior knowledge, pre-training LLMs to internalize these relational semantics; this is further strengthened by downstream fine-tuning and schema-aware prompting. Our approach improves semantic understanding and generalization across diverse database schemas, achieving state-of-the-art exact match (EM) and execution accuracy (EX) on multiple benchmarks, including Spider, Bird, and DuSQL. It effectively mitigates column-name hallucination and value-column alignment errors while demonstrating strong cross-database transferability without requiring task-specific retraining.
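The schema-aware prompting the summary mentions can be pictured as serializing table/column names together with a few authentic cell values into the model's input. The exact prompt format used in the paper is not given here; the following is a minimal sketch under that assumption, with a hypothetical `build_schema_prompt` helper:

```python
# Hypothetical sketch of schema-aware prompting: serialize table/column
# names plus sample cell values as prior knowledge before the question.
# The concrete format in the paper may differ.
def build_schema_prompt(schema, sample_values, question):
    """schema: {table: [columns]}; sample_values: {(table, col): [values]}."""
    lines = []
    for table, columns in schema.items():
        rendered = []
        for col in columns:
            vals = sample_values.get((table, col), [])
            if vals:
                # Show a few real cell values so the model can ground
                # value-to-column matching.
                rendered.append(f"{col} (e.g. {', '.join(map(str, vals[:3]))})")
            else:
                rendered.append(col)
        lines.append(f"Table {table}: " + ", ".join(rendered))
    lines.append(f"Question: {question}")
    lines.append("SQL:")
    return "\n".join(lines)

prompt = build_schema_prompt(
    {"singer": ["singer_id", "name", "country"]},
    {("singer", "country"): ["France", "USA"]},
    "How many singers are from France?",
)
print(prompt)
```

Grounding column names and example values in the prompt is one plausible way to reduce the invalid-column and value-mismatch errors described above.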

📝 Abstract
Text-to-SQL is a subtask of semantic parsing that has seen rapid progress with the evolution of Large Language Models (LLMs). However, LLMs face challenges due to hallucination issues and a lack of domain-specific database knowledge (such as table schemas and cell values). As a result, they can make errors in generating table names and columns, and in matching values to the correct columns in SQL statements. This paper introduces a knowledge injection method that enhances LLMs' ability to understand schema contents by incorporating prior knowledge, improving their performance on Text-to-SQL tasks. Experimental results show that pre-training LLMs on domain-specific database knowledge and fine-tuning them on downstream Text-to-SQL tasks significantly improves the Execution Match (EX) and Exact Match (EM) metrics across various models, effectively reducing errors in generating column names and matching values to columns. Furthermore, the knowledge-injected models can be applied to many downstream Text-to-SQL tasks, demonstrating the generalizability of the approach presented in this paper.
Problem

Research questions and friction points this paper is trying to address.

Improving LLMs' Text-to-SQL accuracy
Reducing column name generation errors
Enhancing value-to-column matching precision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain database knowledge injection
Pre-training on specific databases
Fine-tuning for Text-to-SQL tasks
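One of the error classes the paper targets, column-name hallucination, can be detected post hoc by checking every identifier in the generated SQL against the schema. The paper does not describe this exact check; the sketch below, with a hypothetical `hallucinated_columns` helper, only illustrates the failure mode being measured:

```python
import re

# Hypothetical sketch (not the paper's method): flag identifiers in
# generated SQL that match neither a schema table/column nor a SQL
# keyword -- i.e., likely hallucinated column names.
def hallucinated_columns(sql, schema):
    """schema: {table: [columns]}. Returns sorted unknown identifiers."""
    known = {c.lower() for cols in schema.values() for c in cols}
    known |= {t.lower() for t in schema}
    keywords = {
        "select", "from", "where", "and", "or", "not", "count", "avg",
        "sum", "min", "max", "group", "by", "order", "limit", "join",
        "on", "as", "distinct", "having", "desc", "asc",
    }
    sql = re.sub(r"'[^']*'", "", sql)  # drop string literals (cell values)
    tokens = set(re.findall(r"[a-zA-Z_][a-zA-Z0-9_]*", sql.lower()))
    return sorted(tokens - known - keywords)

schema = {"singer": ["singer_id", "name", "country"]}
print(hallucinated_columns("SELECT nation FROM singer", schema))  # ['nation']
```

A model with injected schema knowledge should produce an empty list here, whereas `nation` is flagged because the schema only contains `country`.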
👥 Authors
Xingyu Ma
School of Cyber Science and Engineering, Huazhong University of Science and Technology
Xin Tian
Wuhan AI Research
Lingxiang Wu
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences
Xuepeng Wang
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences
Xueming Tang
School of Cyber Science and Engineering, Huazhong University of Science and Technology
Jinqiao Wang
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences; Peng Cheng Laboratory