🤖 AI Summary
To address insufficient semantic parsing accuracy in Text-to-SQL tasks, this paper proposes a multi-generator collaborative framework. First, multiple SQL generators—diverse in output format and jointly fine-tuned on schema-aware and text-SQL alignment objectives—are constructed to enhance candidate diversity. Second, schema-guided filtering and structured candidate reorganization improve semantic consistency. Finally, a lightweight selection model picks the optimal SQL query from the candidates. Evaluated on BIRD, the method achieves 75.63% execution accuracy (state-of-the-art), and 89.65% on the Spider test set, substantially outperforming existing single-generator and ensemble approaches. Key contributions include: (1) multi-format fine-tuning for controllable generation diversity; (2) a structured candidate reorganization mechanism enforcing syntactic and semantic coherence; and (3) a low-overhead selection optimization paradigm that avoids costly end-to-end retraining while preserving performance.
📝 Abstract
To leverage the advantages of LLMs in addressing challenges in the Text-to-SQL task, we present XiYan-SQL, an innovative framework that effectively generates and utilizes multiple SQL candidates. It consists of three components: 1) a Schema Filter module that filters and obtains multiple relevant schemas; 2) a multi-generator ensemble approach that generates multiple high-quality and diverse SQL queries; 3) a selection model with a candidate reorganization strategy to obtain the optimal SQL query. Specifically, for the multi-generator ensemble, we employ a multi-task fine-tuning strategy to enhance the capabilities of SQL generation models for the intrinsic alignment between SQL and text, and construct multiple generation models with distinct generation styles by fine-tuning across different SQL formats. The experimental results and comprehensive analysis demonstrate the effectiveness and robustness of our framework. Overall, XiYan-SQL achieves a new SOTA performance of 75.63% on the notable BIRD benchmark, surpassing all previous methods. It also attains SOTA performance on the Spider test set with an accuracy of 89.65%.
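The three-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustrative mock, not the authors' implementation: the function and class names are assumptions, the "generators" are stubs standing in for the fine-tuned, style-specific SQL models, and the schema filter and selection model are replaced by trivial placeholders (keyword matching and a shortest-candidate rule) just to show how the stages compose.

```python
# Hypothetical sketch of a XiYan-SQL-style pipeline: schema filtering,
# multi-generator candidate generation, reorganization, then selection.
# All names are illustrative assumptions, not the paper's actual API.
from dataclasses import dataclass


@dataclass
class Candidate:
    sql: str
    generator: str  # which style-specific generator produced it


def schema_filter(question: str, full_schema: dict) -> dict:
    """Stage 1: keep only tables/columns plausibly relevant to the question.
    A naive keyword match stands in for the learned Schema Filter module."""
    q = question.lower()
    return {
        table: cols
        for table, cols in full_schema.items()
        if table.lower() in q or any(c.lower() in q for c in cols)
    }


def generate_candidates(question: str, schema: dict, generators: dict) -> list:
    """Stage 2: each generator (a distinct generation style) emits one candidate."""
    return [Candidate(sql=gen(question, schema), generator=name)
            for name, gen in generators.items()]


def reorganize(cands: list) -> list:
    """Stage 3a: normalize whitespace/case and deduplicate before selection."""
    seen, out = set(), []
    for c in cands:
        key = " ".join(c.sql.split()).lower()
        if key not in seen:
            seen.add(key)
            out.append(c)
    return out


def select_best(cands: list) -> Candidate:
    """Stage 3b: a lightweight selection model scores candidates.
    A shortest-query rule is a trivial placeholder for the learned scorer."""
    return min(cands, key=lambda c: len(c.sql))


# --- toy usage ---
schema = {"orders": ["id", "amount", "customer_id"],
          "customers": ["id", "name"]}
generators = {
    "compact": lambda q, s: "SELECT SUM(amount) FROM orders",
    "verbose": lambda q, s: "SELECT SUM(o.amount) FROM orders AS o",
    "dup":     lambda q, s: "select sum(amount)  from orders",
}
question = "What is the total amount of all orders?"
filtered = schema_filter(question, schema)
cands = reorganize(generate_candidates(question, filtered, generators))
best = select_best(cands)
print(best.sql)
```

Note how the "dup" candidate is collapsed into "compact" by the reorganization step, so the selector only ranks genuinely distinct queries; in the paper, candidate diversity comes from fine-tuning across different SQL formats rather than from lambdas.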