VERIFY-RL: Verifiable Recursive Decomposition for Reinforcement Learning in Mathematical Reasoning

📅 2026-02-07
🤖 AI Summary
This work addresses a limitation of existing problem decomposition methods in mathematical reasoning: they often rely on heuristic strategies that offer no guarantee that subproblems are simpler, that solving them helps with the parent problem, or that the parent-child relationship is mathematically grounded. The authors propose Verify-RL, a verifiable recursive decomposition framework grounded in symbolic differentiation, which enforces three formal conditions at each decomposition step: strictly decreasing structural complexity, solution containment, and derivation by a formal rule. Because these conditions can be checked automatically through symbolic computation, the framework achieves "verification by construction" and eliminates invalid decompositions at their source. Combining symbolic computation with reinforcement learning and curriculum learning, the approach more than doubles accuracy on the hardest problems, from 32% to 68%, and yields a 40% relative improvement overall.

📝 Abstract
Training language models to solve complex mathematical problems benefits from curriculum learning, progressively training on simpler subproblems. However, existing decomposition methods are often heuristic, offering no guarantees that subproblems are simpler, that solving them aids the parent task, or that their relationships are mathematically grounded. We observe that symbolic differentiation provides a natural structure for verified decomposition: calculus rules explicitly define how expressions reduce to simpler components with provable properties. We introduce Verify-RL, a framework where every parent-child decomposition satisfies three verifiable conditions: strictly decreasing structural complexity, solution containment, and formal rule derivation. Unlike heuristic methods, where a significant fraction of decompositions are invalid, our properties admit automatic verification through symbolic computation, achieving "verification by construction". Experiments demonstrate that eliminating invalid decompositions yields sizable gains: accuracy on the hardest problems more than doubles from 32% to 68%, with a 40% relative improvement overall.
Problem

Research questions and friction points this paper is trying to address.

mathematical reasoning
problem decomposition
verifiability
reinforcement learning
symbolic computation
Innovation

Methods, ideas, or system contributions that make the work stand out.

verifiable decomposition
symbolic differentiation
reinforcement learning
mathematical reasoning
curriculum learning
Kaleem Ullah Qasim
School of Computing and Artificial Intelligence, Southwest Jiaotong University
Reasoning in LLMs, Prompt Engineering, LLM Agents
Jiashu Zhang
School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China
Hao Li
Associate Professor, Xi'an Jiaotong University
Network Measurement, Software Defined Networking
Muhammad Kafeel Shaheen
School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China