🤖 AI Summary
This work addresses the limited generalization of existing Text-to-SQL methods to out-of-distribution and long-tail scenarios, which stems from their reliance on static inference pipelines. To overcome this, the authors propose SquRL, a framework that uses reinforcement learning to train large language models to construct workflows dynamically, so that at inference time the model adaptively composes heterogeneous strategies into a translation path suited to each query. SquRL combines a rule-based reward function, dynamic actor masking, and a pseudo-reward training scheme to move beyond the constraints of fixed pipeline designs. Experimental results show that SquRL consistently outperforms the best static workflow methods on widely used Text-to-SQL benchmarks, with particularly pronounced gains on complex queries and out-of-distribution instances.
📝 Abstract
Text-to-SQL has recently achieved impressive progress, yet it remains difficult to apply effectively in real-world scenarios. This gap stems from the reliance on a single static workflow, which fundamentally limits scalability to out-of-distribution and long-tail scenarios. Instead of requiring users to select suitable methods through extensive experimentation, we aim to enable systems to adaptively construct workflows at inference time. Through theoretical and empirical analysis, we demonstrate that optimal dynamic policies consistently outperform the best static workflow, with performance gains fundamentally driven by heterogeneity across candidate workflows. Motivated by this, we propose SquRL, a reinforcement learning framework that enhances LLMs' reasoning capability in adaptive workflow construction. We design a rule-based reward function and introduce two effective training mechanisms: dynamic actor masking to encourage broader exploration, and pseudo rewards to improve training efficiency. Experiments on widely used Text-to-SQL benchmarks demonstrate that dynamic workflow construction consistently outperforms the best static workflow methods, with especially pronounced gains on complex and out-of-distribution queries. The code is available at https://github.com/Satissss/SquRL
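
The claim that an optimal dynamic policy cannot do worse than the best static workflow, and that the gap depends on heterogeneity across workflows, can be illustrated with a small toy calculation. The sketch below is not taken from the SquRL code base: the correctness matrices are made-up data, and the "dynamic" policy is an idealized oracle that picks a correct workflow whenever one exists, which gives an upper bound rather than the accuracy of any learned policy.

```python
# Toy illustration (hypothetical data, not results from the paper).
# Rows are queries, columns are candidate workflows;
# 1 means that workflow produced a correct SQL query for that query.
HETEROGENEOUS = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]

# Same per-workflow accuracy for workflow 0, but all workflows
# succeed and fail on exactly the same queries.
HOMOGENEOUS = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]


def best_static_accuracy(correct):
    """Accuracy of the single workflow that is best on average."""
    n_queries = len(correct)
    n_workflows = len(correct[0])
    return max(
        sum(row[j] for row in correct) / n_queries
        for j in range(n_workflows)
    )


def oracle_dynamic_accuracy(correct):
    """Accuracy of an oracle that picks the best workflow per query."""
    return sum(max(row) for row in correct) / len(correct)


def dynamic_gain(correct):
    """Gap between the oracle dynamic policy and the best static workflow."""
    return oracle_dynamic_accuracy(correct) - best_static_accuracy(correct)


if __name__ == "__main__":
    for name, matrix in [("heterogeneous", HETEROGENEOUS),
                         ("homogeneous", HOMOGENEOUS)]:
        print(name,
              "static:", best_static_accuracy(matrix),
              "dynamic:", oracle_dynamic_accuracy(matrix),
              "gain:", round(dynamic_gain(matrix), 3))
```

In this toy setup the oracle gains over the best static workflow only when the workflows succeed on different queries; when they all succeed and fail together, the gain is zero. The gain is never negative, because the per-query maximum is at least the value of any fixed column.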