Continual Learning of Domain Knowledge from Human Feedback in Text-to-SQL

📅 2025-11-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle to jointly leverage database schemas and implicit domain knowledge in Text-to-SQL tasks, and lack mechanisms for continuous learning from human feedback. Method: This paper proposes a memory-augmented, learnable agent framework. Its core innovation is the automatic distillation of natural-language human feedback into structured, procedural memory—enabling cross-task knowledge accumulation, retrieval, and reuse—and a feedback-driven memory update mechanism that dynamically models schema semantics and domain rules. Contribution/Results: Evaluated on the BIRD development set, the proposed Procedural Agent achieves significant gains in execution accuracy and substantial error reduction. It is the first work to empirically validate the effectiveness and scalability of a structured-memory-based, human-in-the-loop continual learning paradigm for Text-to-SQL.

📝 Abstract
Large Language Models (LLMs) can generate SQL queries from natural language questions but struggle with database-specific schemas and tacit domain knowledge. We introduce a framework for continual learning from human feedback in text-to-SQL, where a learning agent receives natural language feedback to refine queries and distills the revealed knowledge for reuse on future tasks. This distilled knowledge is stored in a structured memory, enabling the agent to improve execution accuracy over time. We design and evaluate multiple variations of a learning agent architecture that vary in how they capture and retrieve past experiences. Experiments on the BIRD benchmark Dev set show that memory-augmented agents, particularly the Procedural Agent, achieve significant accuracy gains and error reduction by leveraging human-in-the-loop feedback. Our results highlight the importance of transforming tacit human expertise into reusable knowledge, paving the way for more adaptive, domain-aware text-to-SQL systems that continually learn from a human-in-the-loop.
Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with database-specific schemas in text-to-SQL
Tacit domain knowledge is not captured by current text-to-SQL systems
Systems lack continual learning from human feedback for SQL refinement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual learning framework from human feedback
Structured memory stores distilled domain knowledge
Memory-augmented agents improve SQL execution accuracy
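To make the core loop concrete, here is a minimal sketch of a structured procedural memory that distills human feedback into retrievable rules. This is an illustrative assumption, not the paper's implementation: the `ProceduralMemory` class, keyword-based retrieval, and the example rule about `orders.amount` are all hypothetical, and the paper's actual distillation step is LLM-driven rather than manual.

```python
from dataclasses import dataclass, field

@dataclass
class ProceduralMemory:
    """Structured store of domain rules distilled from human feedback.

    Each rule pairs a set of trigger keywords with a natural-language
    hint to prepend to future SQL-generation prompts (hypothetical
    design; the paper's retrieval mechanism may differ).
    """
    rules: list = field(default_factory=list)  # (keywords, hint) pairs

    def distill(self, feedback: str, keywords: set) -> None:
        # In the paper this step is automated by an LLM; here we store
        # the raw feedback as the rule body, keyed by given keywords.
        self.rules.append((frozenset(keywords), feedback))

    def retrieve(self, question: str, top_k: int = 3) -> list:
        # Score rules by keyword overlap with the new question and
        # return the top-k hints for reuse on this task.
        tokens = set(question.lower().split())
        scored = [
            (len(kw & tokens), hint)
            for kw, hint in self.rules
            if kw & tokens
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [hint for _, hint in scored[:top_k]]

memory = ProceduralMemory()
memory.distill(
    "Revenue figures in `orders.amount` are stored in cents; divide by 100.",
    {"revenue", "amount", "orders"},
)
hints = memory.retrieve("What was total revenue last quarter?")
```

The retrieved hints would be injected into the agent's prompt on the next related query, which is how knowledge revealed by one feedback episode carries over to future tasks.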
Thomas Cook, JPMorganChase
Kelly Patel, JPMorganChase
Sivapriya Vellaichamy, JPMorganChase
Saba Rahimi, JPMorganChase
Zhen Zeng, JPMorganChase
Sumitra Ganesh, Research Director, J.P.Morgan AI Research
multi-agent systems · reinforcement learning