MathAgent: Adversarial Evolution of Constraint Graphs for Mathematical Reasoning Data Synthesis

📅 2026-04-13

🤖 AI Summary

Existing approaches to synthesizing mathematical reasoning data often suffer from mode collapse and oversimplified logic. This work proposes a hierarchical synthesis framework that formulates data generation as an unsupervised optimization problem over a constraint graph, decoupling logical structure from linguistic expression. A Legislator module uses adversarial evolution to produce diverse and complex constraint blueprints, and an Executor module then instantiates them as natural language questions. This departs from conventional end-to-end generation. Ten models from the Qwen, Llama, Mistral, and Gemma series, each fine-tuned on only 1K synthesized samples, outperform counterparts fine-tuned on comparably sized datasets such as LIMO and s1K across eight mathematical reasoning benchmarks and generalize better out of distribution.

📝 Abstract

Synthesizing high-quality mathematical reasoning data without human priors remains a significant challenge. Current approaches typically rely on seed data mutation or simple prompt engineering, and often suffer from mode collapse and limited logical complexity. This paper proposes a hierarchical synthesis framework that formulates data synthesis as an unsupervised optimization problem over a constraint graph followed by semantic instantiation, rather than treating it as a direct text generation task. We introduce a Legislator-Executor paradigm: the Legislator adversarially evolves structured generation blueprints that encode the constraints of a problem, while the Executor instantiates these blueprints into diverse natural language scenarios. Decoupling skeleton design from linguistic realization lets the method focus first on constructing complex and diverse logical structures, which in turn guide high-quality data synthesis. In experiments on 10 models across the Qwen, Llama, Mistral, and Gemma series, models fine-tuned on 1K synthesized samples outperform those fine-tuned on widely used datasets of comparable scale (LIMO, s1K) across eight mathematical benchmarks and exhibit superior out-of-distribution generalization.
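
To make the Legislator-Executor split concrete, the toy sketch below evolves a small arithmetic constraint graph and then renders it as text. It is not the paper's implementation: the abstract does not specify the graph representation, the fitness function, or the adversary, so every name here (`Blueprint`, `Constraint`, `mutate`, `fitness`, `evolve`, `execute`) is hypothetical. The "adversary" is assumed to be a weak solver whose depth budget grows each generation, and a string template stands in for the language-model Executor.

```python
import random
from dataclasses import dataclass, field

# Hypothetical operator set for the toy constraint graph.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

@dataclass(frozen=True)
class Constraint:
    out: str    # variable defined by this constraint
    op: str     # key into OPS
    left: str   # name of an earlier variable
    right: str  # name of an earlier variable

@dataclass
class Blueprint:
    givens: dict                                   # variable name -> known integer
    constraints: list = field(default_factory=list)

    def depth(self):
        # Longest dependency chain from any given to any derived variable.
        d = {name: 0 for name in self.givens}
        for c in self.constraints:
            d[c.out] = 1 + max(d[c.left], d[c.right])
        return max(d.values())

    def solve(self):
        # Ground-truth answer: evaluate constraints in definition order.
        vals = dict(self.givens)
        for c in self.constraints:
            vals[c.out] = OPS[c.op](vals[c.left], vals[c.right])
        return vals[self.constraints[-1].out] if self.constraints else None

    def signature(self):
        return tuple(c.op for c in self.constraints)

def mutate(bp, rng):
    # Legislator move (assumed): append one constraint over existing variables.
    names = list(bp.givens) + [c.out for c in bp.constraints]
    new = Constraint(f"v{len(bp.constraints)}", rng.choice(sorted(OPS)),
                     rng.choice(names), rng.choice(names))
    return Blueprint(dict(bp.givens), bp.constraints + [new])

def fitness(bp, seen, budget):
    # Assumed objective: beat a weak solver limited to `budget` steps of depth,
    # plus a bonus for operator sequences not yet seen (against mode collapse).
    hard = 1.0 if bp.depth() > budget else 0.0
    novel = 0.5 if bp.signature() not in seen else 0.0
    return hard + novel

def evolve(seed, rng, generations=6, pop_size=8, budget=2):
    population, seen = [seed], set()
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        children.sort(key=lambda b: fitness(b, seen, budget), reverse=True)
        population = children[: pop_size // 2]
        seen.update(b.signature() for b in population)
        budget += 1  # the adversary gets stronger each round
    return population

def execute(bp):
    # Executor stand-in: a fixed template instead of a language model.
    # Assumes the blueprint has at least one constraint.
    words = {"+": "the sum of", "-": "the difference of", "*": "the product of"}
    facts = [f"{n} is {v}" for n, v in bp.givens.items()]
    steps = [f"{c.out} is {words[c.op]} {c.left} and {c.right}" for c in bp.constraints]
    return "; ".join(facts + steps) + f". What is {bp.constraints[-1].out}?"

rng = random.Random(0)
seed = Blueprint({"a": 3, "b": 5})
best = evolve(seed, rng)[0]
print(execute(best), "->", best.solve())
```

The point of the sketch is the separation of concerns: `evolve` only ever sees graph structure, so difficulty and diversity are optimized before any wording exists, and `execute` could be swapped for a richer renderer without touching the search.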

Problem

Research questions and friction points this paper is trying to address.

mathematical reasoning

data synthesis

constraint graphs

logical complexity

mode collapse

Innovation

Methods, ideas, or system contributions that make the work stand out.

constraint graph

adversarial evolution

Legislator-Executor paradigm