HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration

📅 2025-06-11

🏛️ IEEE International Joint Conference on Neural Network

📈 Citations: 0

✨ Influential: 0

career value

142K/year

🤖 AI Summary

Text-to-SQL systems suffer from low accuracy, high latency, and severe error propagation when handling complex queries involving multi-table joins and nested conditions—primarily due to reliance on multi-stage pipelines and manually engineered prompts. To address this, we propose a history-aware dynamic prompt generation mechanism that leverages historical query logs to inject lightweight, context-sensitive hints directly into the input sequence. Our approach tightly integrates query log analysis, adaptive prompt engineering, and end-to-end LLM inference—marking the first such unification. It eliminates multi-stage processing and manual intervention entirely. Evaluated on Spider and BIRD benchmarks, our method improves execution accuracy by +8.2%, reduces LLM API calls by 37%, and cuts end-to-end latency by 29%. The solution achieves both computational efficiency and practical deployability.

Technology Category

Application Category

📝 Abstract

Text-to-SQL generation bridges the gap between natural language and databases, enabling users to query data without requiring SQL expertise. While large language models (LLMs) have significantly advanced the field, challenges remain in handling complex queries that involve multi-table joins, nested conditions, and intricate operations. Existing methods often rely on multi-step pipelines that incur high computational costs, increase latency, and are prone to error propagation. To address these limitations, we propose HI-SQL, a pipeline that incorporates a novel hint generation mechanism utilizing historical query logs to guide SQL generation. By analyzing prior queries, our method generates contextual hints that focus on handling the complexities of multi-table and nested operations. These hints are seamlessly integrated into the SQL generation process, eliminating the need for costly multi-step approaches and reducing reliance on human-crafted prompts. Experimental evaluations on multiple benchmark datasets demonstrate that our approach significantly improves query accuracy of LLM-generated queries while ensuring efficiency in terms of LLM calls and latency, offering a robust and practical solution for enhancing Text-to-SQL systems.

Problem

Research questions and friction points this paper is trying to address.

Optimizes Text-to-SQL systems for complex queries

Reduces computational costs and latency in SQL generation

Improves accuracy by integrating dynamic hints from query logs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic hint generation from historical query logs

Seamless integration of contextual hints into SQL generation

Eliminates costly multi-step pipelines for efficiency

🔎 Similar Papers

A Survey on Employing Large Language Models for Text-to-SQL Tasks