From Questions to Queries: An AI-powered Multi-Agent Framework for Spatial Text-to-SQL

📅 2025-10-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Non-experts face significant barriers to analyzing geospatial data because of the steep learning curve of SQL and spatial tools such as PostGIS. To address this, we propose a collaborative multi-agent framework that targets key limitations of single-agent approaches in spatial Text-to-SQL, particularly in semantic understanding and the modeling of complex spatial predicates. Our method combines programmatic schema profiling, a semantically enriched knowledge base, and a self-verification mechanism, and it unifies embedding-based retrieval, entity and metadata extraction, query logic formulation, and SQL generation. A review agent then validates the generated SQL both programmatically and semantically to check correctness and intent alignment. Evaluated on KaggleDBQA and our newly constructed SpatialQueryQA benchmark, the framework achieves 81.2% and 87.7% overall accuracy, respectively. On SpatialQueryQA this is up from 76.7% without the review agent, and in some instances the generated queries align more closely with user intent than the benchmark references do.
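To make the embedding-based retrieval step concrete, here is a minimal sketch of ranking knowledge-base entries against a question. It is not the authors' implementation: the table names and descriptions are hypothetical, and `embed` is a toy bag-of-words stand-in for a learned embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base entries: table name -> enriched description.
KNOWLEDGE_BASE = {
    "schools": "point locations of public schools with name and enrollment",
    "rivers": "line geometries of rivers and streams with name and length",
    "counties": "polygon boundaries of counties with name and population",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, k: int = 2) -> list:
    """Return the k table names whose descriptions best match the question."""
    query_vec = embed(question)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda name: cosine(query_vec, embed(name + " " + KNOWLEDGE_BASE[name])),
        reverse=True,
    )
    return ranked[:k]


top = retrieve("Which schools have enrollment above 500?", k=1)
```

In a real system the count vectors would be replaced by dense embeddings, but the ranking logic stays the same: embed the question, score each entry, keep the top matches as context for the downstream agents.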

📝 Abstract
The complexity of Structured Query Language (SQL) and the specialized nature of geospatial functions in tools like PostGIS present significant barriers to non-experts seeking to analyze spatial data. While Large Language Models (LLMs) offer promise for translating natural language into SQL (Text-to-SQL), single-agent approaches often struggle with the semantic and syntactic complexities of spatial queries. To address this, we propose a multi-agent framework designed to accurately translate natural language questions into spatial SQL queries. The framework integrates several innovative components, including a knowledge base with programmatic schema profiling and semantic enrichment, embeddings for context retrieval, and a collaborative multi-agent pipeline as its core. This pipeline comprises specialized agents for entity extraction, metadata retrieval, query logic formulation, SQL generation, and a review agent that performs programmatic and semantic validation of the generated SQL to ensure correctness (self-verification). We evaluate our system using both the non-spatial KaggleDBQA benchmark and a new, comprehensive SpatialQueryQA benchmark that includes diverse geometry types, predicates, and three levels of query complexity. On KaggleDBQA, the system achieved an overall accuracy of 81.2% (221 out of 272 questions) after the review agent's review and corrections. For spatial queries, the system achieved an overall accuracy of 87.7% (79 out of 90 questions), compared with 76.7% without the review agent. Beyond accuracy, results also show that in some instances the system generates queries that are more semantically aligned with user intent than those in the benchmarks. This work makes spatial analysis more accessible, and provides a robust, generalizable foundation for spatial Text-to-SQL systems, advancing the development of autonomous GIS.
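The agent sequence described in the abstract can be sketched as a chain of functions that pass a shared state object along. This is only an illustration under stated assumptions: the schema, the `QueryState` fields, and every agent body are hypothetical rule-based stubs standing in for LLM calls, and the review step here is a deliberately trivial check.

```python
from dataclasses import dataclass, field

# Hypothetical schema; a real system would read this from its knowledge base.
SCHEMA = {"schools": ["name", "geom"], "rivers": ["name", "geom"]}


@dataclass
class QueryState:
    """Shared state handed from one agent to the next (illustrative fields)."""
    question: str
    entities: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    logic: str = ""
    sql: str = ""
    review_notes: list = field(default_factory=list)


def extract_entities(state: QueryState) -> QueryState:
    # Stub: match question words against known table names.
    words = state.question.lower().replace("?", "").split()
    state.entities = [table for table in SCHEMA if table in words]
    return state


def retrieve_metadata(state: QueryState) -> QueryState:
    state.metadata = {table: SCHEMA[table] for table in state.entities}
    return state


def formulate_logic(state: QueryState) -> QueryState:
    state.logic = "join " + " with ".join(state.entities) + " on a distance predicate"
    return state


def generate_sql(state: QueryState) -> QueryState:
    if len(state.entities) != 2:
        raise ValueError("this stub only handles two-table distance questions")
    a, b = state.entities
    # ST_DWithin is a PostGIS function; on geography values the distance is in meters.
    state.sql = (
        f"SELECT DISTINCT a.name FROM {a} a JOIN {b} b "
        "ON ST_DWithin(a.geom::geography, b.geom::geography, 500);"
    )
    return state


def review(state: QueryState) -> QueryState:
    # Stub review: confirm every extracted table is referenced in the SQL.
    missing = [table for table in state.entities if table not in state.sql]
    state.review_notes = [f"missing table: {t}" for t in missing] or ["passed"]
    return state


PIPELINE = [extract_entities, retrieve_metadata, formulate_logic, generate_sql, review]


def run(question: str) -> QueryState:
    state = QueryState(question)
    for agent in PIPELINE:
        state = agent(state)
    return state


result = run("Which schools are within 500 meters of rivers?")
```

Keeping each agent as a function over one state object makes it easy to log intermediate results or to swap a stub for a model-backed agent without touching the others.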
Problem

Research questions and friction points this paper is trying to address.

Translating natural language questions into spatial SQL queries for non-experts
Overcoming semantic and syntactic complexities in spatial query generation
Improving accuracy of spatial Text-to-SQL systems through multi-agent collaboration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework for spatial text-to-SQL translation
Knowledge base with schema profiling and semantic enrichment
Review agent performs programmatic and semantic SQL validation
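As a rough illustration of what programmatic plus semantic validation could look like, the sketch below compiles a candidate query against a schema and then runs a crude intent check. The assumptions are significant: SQLite (via Python's standard `sqlite3` module) stands in for PostgreSQL/PostGIS, so spatial functions would not compile here, and `semantic_check` is a hypothetical keyword test in place of an LLM-based review. The schema and function names are invented for the example.

```python
import sqlite3
from typing import Optional

# Hypothetical schema used only for this example.
SCHEMA_DDL = "CREATE TABLE schools (name TEXT, enrollment INTEGER);"


def programmatic_check(sql: str, schema_ddl: str = SCHEMA_DDL) -> Optional[str]:
    """Return None if the query compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN prepares the statement, which surfaces syntax errors and
        # unknown tables or columns without running the query itself.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


def semantic_check(sql: str, required_terms: list) -> list:
    """Crude intent check: list required terms that the SQL never mentions."""
    lowered = sql.lower()
    return [term for term in required_terms if term.lower() not in lowered]


ok = programmatic_check("SELECT name FROM schools WHERE enrollment > 500")
bad = programmatic_check("SELECT title FROM schools")
gaps = semantic_check("SELECT name FROM schools", ["enrollment"])
```

A reviewer built this way can feed the error text or the list of missing terms back to the SQL-generation step as a correction hint, which is one plausible reading of the self-verification loop.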
Authors

Ali Khosravi Kazazi
Geoinformation and Big Data Research Laboratory, Department of Geography, The Pennsylvania State University, University Park, PA, USA
Zhenlong Li
Associate Professor, The Pennsylvania State University
GIScience, Geospatial Big Data, Spatial Computing, Autonomous GIS, GeoAI/AgenticAI
M. Naser Lessani
Geoinformation and Big Data Research Laboratory, Department of Geography, The Pennsylvania State University, University Park, PA, USA
Guido Cervone
The Pennsylvania State University (Penn State - PSU)
Machine learning, spatio-temporal data mining, natural hazards, remote sensing