🤖 AI Summary
Problem: Text-to-SQL driven by in-context learning (ICL) suffers from poorly understood error root causes, and existing repair methods are inaccurate, computationally expensive, and generalize poorly across settings. Method: This paper systematically categorizes 29 representative error types in seven categories, covering four mainstream ICL-based techniques and five basic repair methods, and proposes MapleRepair, a lightweight, error-aware repair framework integrating error-pattern analysis, prompt engineering optimization, SQL syntactic/semantic consistency verification, and iterative re-generation. Results: Experiments show that MapleRepair repairs 13.8% more queries than existing solutions, with a near-zero mis-repair rate and 67.4% less computational overhead, and that it generalizes across the large language models and the two major Text-to-SQL benchmarks evaluated.
📝 Abstract
Large language models (LLMs) have been adopted to perform text-to-SQL tasks, utilizing their in-context learning (ICL) capability to translate natural language questions into structured query language (SQL). However, this technique suffers from correctness problems and needs efficient repair solutions. In this paper, we conduct the first comprehensive study of text-to-SQL errors. Our study covers four representative ICL-based techniques, five basic repairing methods, two benchmarks, and two LLM settings. We find that text-to-SQL errors are widespread and summarize 29 error types in 7 categories. We also find that existing repair attempts yield limited correctness improvement at the cost of high computational overhead and many mis-repairs. Based on these findings, we propose MapleRepair, a novel text-to-SQL error detection and repair framework. The evaluation demonstrates that MapleRepair outperforms existing solutions, repairing 13.8% more queries with negligible mis-repairs and 67.4% less overhead.
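To make the detect-then-repair idea concrete, here is a minimal Python sketch, not MapleRepair's actual implementation, which the abstract does not describe in detail. It assumes a SQLite database, uses `EXPLAIN QUERY PLAN` as a stand-in error detector, and treats `repair_fn` as a hypothetical callback (for example, an LLM call). The names `detect_error` and `repair_if_needed` are illustrative only.

```python
import sqlite3
from typing import Callable, Optional


def detect_error(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return SQLite's error message if the query cannot be compiled, else None."""
    try:
        # EXPLAIN QUERY PLAN compiles the statement against the schema
        # without executing it, so schema and syntax errors surface cheaply.
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
        return None
    except sqlite3.Error as exc:
        return str(exc)


def repair_if_needed(
    conn: sqlite3.Connection,
    sql: str,
    repair_fn: Callable[[str, str], str],  # hypothetical: (sql, error) -> new sql
    max_rounds: int = 2,
) -> str:
    """Call repair_fn only when an error is detected, for at most max_rounds."""
    for _ in range(max_rounds):
        error = detect_error(conn, sql)
        if error is None:
            # No detected error: leave the query untouched.
            return sql
        sql = repair_fn(sql, error)
    return sql


# Usage with a toy repair function standing in for a real repairer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
fixed = repair_if_needed(
    conn,
    "SELECT nme FROM singer",
    lambda sql, err: sql.replace("nme", "name"),
)
```

Gating the repair step on a detected error is one simple way to avoid both unnecessary repair calls and changes to queries that were already valid, which are the two costs the abstract attributes to existing repair attempts. A compile-time check like this one catches only syntax and schema errors; semantic errors would need additional detectors.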