Solving Math Word Problems Using Estimation Verification and Equation Generation

📅 2025-09-22
🤖 AI Summary
Large language models (LLMs) have limited mathematical reasoning and symbolic computation abilities when solving math word problems (MWPs). To address this, the paper proposes EstiMath, a framework that first uses an LLM to decompose an MWP and generate solvable equations, then calls an external symbolic solver to obtain an initial answer. EstiMath then adds an estimation-based verification step: the LLM produces a second, coarse numerical estimate, which is compared against the initial answer, and a mismatch triggers iterative error correction. To the authors' knowledge, EstiMath is the first method to target trigonometric MWPs. On the numeric and algebraic MWP benchmarks used in prior work, it reports state-of-the-art results, improving on the previous best by 1.9 percentage points in average accuracy. The authors also release two datasets: SVAMPClean, a cleaned MWP dataset, and Trig300, a benchmark of trigonometric MWPs.

📝 Abstract
Large Language Models (LLMs) excel at various tasks, including problem-solving and question-answering. However, LLMs often find Math Word Problems (MWPs) challenging because solving them requires a range of reasoning and mathematical abilities with which LLMs seem to struggle. Recent efforts have helped LLMs solve more complex MWPs with improved prompts. This study proposes a novel method that initially prompts an LLM to create equations from a decomposition of the question, followed by using an external symbolic equation solver to produce an answer. To ensure the accuracy of the obtained answer, inspired by an established recommendation of math teachers, the LLM is instructed to solve the MWP a second time, but this time with the objective of estimating the correct answer instead of solving it exactly. The estimation is then compared to the generated answer to verify. If verification fails, an iterative rectification process is employed to ensure the correct answer is eventually found. This approach achieves new state-of-the-art results on datasets used by prior published research on numeric and algebraic MWPs, improving the previous best results by nearly two percent on average. In addition, the approach obtains satisfactory results on trigonometric MWPs, a task not previously attempted to the authors' best knowledge. This study also introduces two new datasets, SVAMPClean and Trig300, to further advance the testing of LLMs' reasoning abilities.
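As an illustration of how a rough estimate can check the solver's answer, consider a hypothetical problem that is not taken from the paper: a train covers 198 km in 2.2 hours, and the question asks for its average speed. The numbers, the rounding, and the informal notion of "close enough" used here are assumptions chosen for the example; the abstract does not state the comparison criterion.

```latex
\begin{align*}
\text{Generated equation:}\quad & 2.2\,v = 198 \\
\text{Solver output:}\quad & v = \frac{198}{2.2} = 90 \ \text{km/h} \\
\text{Rough estimate:}\quad & v \approx \frac{200}{2} = 100 \ \text{km/h} \\
\text{Relative gap:}\quad & \frac{|90 - 100|}{100} = 0.10
\end{align*}
```

A 10% gap is plausible for a rounded estimate, so this answer would pass. Had the equation been mis-generated as v = 198 × 2.2 = 435.6, the gap would exceed three times the estimate, and the mismatch would send the problem back for rectification.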
Problem

Research questions and friction points this paper is trying to address.

Solving Math Word Problems using estimation verification and equation generation
Improving LLM accuracy on numeric and algebraic mathematical problems
Verifying answer correctness through an iterative estimation-comparison process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decompose questions into equations for solving
Verify answers by comparing with LLM estimations
Employ iterative rectification if verification fails
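The three steps above can be sketched as a small control loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 25% relative tolerance, the retry limit, the feedback message, and the choice to estimate once and reuse that estimate are all hypothetical. The LLM and the symbolic solver are replaced by stub functions so the block runs on its own.

```python
from typing import Callable, Optional


def agrees(exact: float, estimate: float, rel_tol: float = 0.25) -> bool:
    """Return True when the two values differ by at most rel_tol of the larger one.

    The tolerance is an assumption; the source does not give a threshold.
    """
    scale = max(abs(exact), abs(estimate), 1e-9)  # avoid a zero scale
    return abs(exact - estimate) <= rel_tol * scale


def solve_with_verification(
    problem: str,
    solve_exact: Callable[[str, Optional[str]], float],
    estimate: Callable[[str], float],
    max_rounds: int = 3,
) -> Optional[float]:
    """Solve, compare against a rough estimate, and retry with feedback on mismatch.

    solve_exact stands in for "LLM writes equations, external solver solves them";
    estimate stands in for "LLM estimates the answer". Both are hypothetical hooks.
    """
    rough = estimate(problem)
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        answer = solve_exact(problem, feedback)
        if agrees(answer, rough):
            return answer
        # Rectification step: tell the solver why the last attempt was rejected.
        feedback = f"Answer {answer} disagrees with the estimate {rough}; revise the equations."
    return None  # no verified answer within the retry budget


# Stubs for demonstration only: the first attempt multiplies instead of dividing.
def fake_solver(problem: str, feedback: Optional[str]) -> float:
    return 198 * 2.2 if feedback is None else 90.0


def fake_estimator(problem: str) -> float:
    return 200 / 2


result = solve_with_verification(
    "A train covers 198 km in 2.2 hours. What is its average speed?",
    fake_solver,
    fake_estimator,
)
print(result)
```

Returning `None` when the retry budget runs out keeps an unverified answer from being reported as verified; a real system might instead fall back to the last answer or to the estimate.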
Mitchell Piehl
Computer Science, University of Iowa
Dillon Wilson
Computer Science, University of Colorado, Colorado Springs, USA
Ananya Kalita
Foster School of Business, University of Washington, Seattle, Washington
Jugal Kalita
University of Colorado, Colorado Springs
Natural Language Processing, Computational Linguistics, Anomaly Detection, Cybersecurity