OraPlan-SQL: A Planning-Centric Framework for Complex Bilingual NL2SQL Reasoning

📅 2025-10-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets complex bilingual NL2SQL tasks that require arithmetic, commonsense, and hypothetical reasoning, along with the transliteration and entity-mismatch problems that arise across languages. It proposes an agent-based framework centered on stepwise natural language planning, in which a Planner Agent and a SQL Agent collaborate. The framework combines entity-linking guidance, multi-candidate plan generation, and feedback-driven meta-prompt optimization: failure cases are clustered with human input, and an LLM distills them into corrective guidelines for the planner. Plan diversification with majority voting over execution results improves robustness. On the English and Chinese benchmarks, the framework reaches execution accuracies of 55.0% and 56.7%, respectively, outperforming the second-best system by more than six percentage points while keeping SQL validity above 99%.

📝 Abstract

We present OraPlan-SQL, our system for the Archer NL2SQL Evaluation Challenge 2025, a bilingual benchmark requiring complex reasoning such as arithmetic, commonsense, and hypothetical inference. OraPlan-SQL ranked first, exceeding the second-best system by more than 6% in execution accuracy (EX), with 55.0% in English and 56.7% in Chinese, while maintaining over 99% SQL validity (VA). Our system follows an agentic framework with two components: a Planner agent that generates stepwise natural language plans, and a SQL agent that converts these plans into executable SQL. Since the SQL agent reliably adheres to the plan, our refinements focus on the planner. Unlike prior methods that rely on multiple sub-agents for planning and suffer from orchestration overhead, we introduce a feedback-guided meta-prompting strategy to refine a single planner. Failure cases from a held-out set are clustered with human input, and an LLM distills them into corrective guidelines that are integrated into the planner's system prompt, improving generalization without added complexity. For the multilingual scenario, to address transliteration and entity mismatch issues, we incorporate entity-linking guidelines that generate alternative surface forms for entities and explicitly include them in the plan. Finally, we enhance reliability through plan diversification: multiple candidate plans are generated for each query, the SQL agent produces a query for each plan, and the final output is selected via majority voting over their execution results.
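The final selection step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `execute` and `majority_vote`, the in-memory SQLite database, the toy `sales` table, and the tie-breaking rule (first candidate with the most common result wins) are all assumptions made for illustration. The candidate queries stand in for what a SQL agent might produce from several different plans.

```python
import sqlite3
from collections import Counter

def execute(conn, sql):
    """Run one candidate query; return a hashable, order-insensitive result,
    or None if the query fails to execute."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))

def majority_vote(conn, candidates):
    """Return the first candidate whose execution result is the most common
    one among all candidates that executed successfully (assumed tie-break)."""
    valid = [(sql, res) for sql in candidates
             if (res := execute(conn, sql)) is not None]
    if not valid:
        return None
    winner, _ = Counter(res for _, res in valid).most_common(1)[0]
    return next(sql for sql, res in valid if res == winner)

# Hypothetical toy database and candidate queries, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 10.0), ("south", 5.0), ("north", 2.5)])
candidates = [
    "SELECT SUM(amount) FROM sales WHERE region = 'north'",
    "SELECT SUM(amount) FROM sales WHERE region = 'North'",    # wrong casing
    "SELECT TOTAL(amount) FROM sales WHERE region = 'north'",
    "SELECT SUM(amout) FROM sales",                            # invalid column
]
print(majority_vote(conn, candidates))
```

Voting on execution results rather than on SQL text lets syntactically different but equivalent queries reinforce each other, and candidates that fail to execute simply drop out of the vote.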

Problem

Research questions and friction points this paper is trying to address.

Solving complex bilingual NL2SQL reasoning with a planning-centric framework

Addressing multilingual entity mismatch through entity-linking guidelines

Improving reliability via plan diversification and majority voting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Feedback-guided meta-prompting refines a single planner agent

Entity-linking guidelines resolve multilingual surface form mismatches

Plan diversification with majority voting enhances execution reliability
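As a rough sketch of the first point, the snippet below shows one way distilled guidelines could be folded into a planner's system prompt. Everything here is hypothetical: `build_planner_prompt`, the `distill` callable (a stand-in for an LLM call), the cluster labels, and the prompt layout are assumptions, since the summary does not specify how the guidelines are formatted or injected.

```python
from typing import Callable, Dict, List

def build_planner_prompt(
    base_prompt: str,
    failure_clusters: Dict[str, List[str]],
    distill: Callable[[str, List[str]], str],
) -> str:
    """Append one corrective guideline per failure cluster to the base prompt.

    `distill` stands in for an LLM that turns a cluster of failure cases
    into a single guideline sentence (an assumption for this sketch).
    """
    guidelines = [distill(label, cases)
                  for label, cases in failure_clusters.items()]
    lines = [base_prompt.rstrip(), "", "Corrective guidelines:"]
    lines += [f"{i}. {g}" for i, g in enumerate(guidelines, start=1)]
    return "\n".join(lines)

def fake_distill(label: str, cases: List[str]) -> str:
    """Placeholder for the LLM distillation step; returns a canned sentence."""
    return f"When handling {label}, double-check the plan ({len(cases)} past failures)."

# Hypothetical failure clusters, keyed by a human-assigned label.
clusters = {"percentages": ["q1", "q2"], "hypotheticals": ["q3"]}
prompt = build_planner_prompt("You are a planner.", clusters, fake_distill)
print(prompt)
```

Keeping the guidelines in a single system prompt, rather than spreading them over several sub-agents, matches the summary's emphasis on refining one planner without added orchestration.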

🔎 Similar Papers

A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?