Draw2Think: Harnessing Geometry Reasoning through Constraint Engine Interaction

📅 2026-05-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing vision-language models expose no verifiable intermediate states during geometric reasoning, so nothing guarantees that the spatial relations they state actually hold. This work proposes Draw2Think, a Propose-Draw-Verify framework that externalizes geometric reasoning through agentic interaction with the GeoGebra constraint engine. The model expresses hypotheses on an executable canvas and receives structured feedback, so reasoning is grounded in a shared state validated by algebraic constraints. The method makes Construction Fidelity and Measurement Faithfulness independently auditable, and its canvases pass 95.9% of predicate-level and 84.0% of strict problem-level construction checks on the GeoGoal benchmark. It improves outcome accuracy by up to 4.1% on planar and 16.4% on solid geometry benchmarks, and attains GenExam-math rendering scores of 68.2% (strict) and 90.5% (relaxed).
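The two construction figures can be read as two different aggregations over the same per-predicate checks. The sketch below assumes that "predicate-level" means the fraction of all individual predicates that pass and that "strict problem-level" means the fraction of problems in which every predicate passes. The summary does not spell out these definitions, so treat them, the function name `construction_scores`, and the toy data as illustrative assumptions rather than the GeoGoal scoring code.

```python
def construction_scores(results):
    """Aggregate per-predicate pass/fail flags into two scores.

    `results` is a list with one entry per problem; each entry is a list of
    booleans, one per checked predicate (assumed format, not from the paper).
    """
    total_predicates = sum(len(problem) for problem in results)
    passed_predicates = sum(sum(problem) for problem in results)
    # Assumed meaning of "predicate-level": share of predicates that pass.
    predicate_level = passed_predicates / total_predicates
    # Assumed meaning of "strict problem-level": share of problems in which
    # every predicate passes.
    strict_problem_level = sum(all(problem) for problem in results) / len(results)
    return predicate_level, strict_problem_level


# Toy data: three problems, one of which fails a single predicate.
toy_results = [[True, True, True], [True, False, True], [True, True]]
predicate_level, strict_problem_level = construction_scores(toy_results)
```

Under these assumptions a single failed predicate costs little at the predicate level but fails the whole problem at the strict level, which would explain why the strict figure is the lower of the two.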
📝 Abstract
Vision-language models solve geometry problems with rising accuracy, yet their intermediate states remain latent and unverifiable: a relation expressed in textual reasoning or drawing code carries no guarantee that a constraint-satisfying configuration realizes it. We observe that existing externalization methods based on rendered pixels or one-shot scripts fail to provide exact, per-action geometric guarantees. Enforcing geometric relations by algebraic definition closes this gap: the workspace becomes a constraint-checked evolving canvas. We present Draw2Think, a framework that recasts geometric reasoning from latent spatial inference into agentic interaction with the GeoGebra constraint engine. In a Propose-Draw-Verify loop, Draw2Think externalizes hypotheses onto an executable canvas, measures exact geometric quantities, and feeds structured observations back to the model, so subsequent reasoning proceeds from checked canvas state grounded by the shared workspace. This externalization makes two properties separately auditable: model-level Construction Fidelity (whether the canvas realizes the intended configuration) and engine-level Measurement Faithfulness (exact values and relations from canvas constraints). Across construction, outcome, and rendering evaluations, Draw2Think builds canvases that pass 95.9% predicate-level and 84.0% strict problem-level construction checks on GeoGoal, improves outcome accuracy by up to 4.1%/16.4% on planar/solid benchmarks, and attains 68.2%/90.5% strict/relaxed rendering scores on GenExam-math. Project page is available at https://draw2think.github.io/
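The Propose-Draw-Verify loop described in the abstract can be illustrated with a minimal sketch. It does not use GeoGebra or a real model: `Canvas` is a toy stand-in for the constraint engine, `propose` is a hard-coded stand-in for the vision-language model, and the task (placing the midpoint M of segment AB) is a hypothetical example chosen for brevity. All names here are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Canvas:
    """Toy stand-in for an executable canvas (not the GeoGebra API)."""
    points: dict = field(default_factory=dict)

    def draw(self, name, x, y):
        # Drawing a point with an existing name overwrites it.
        self.points[name] = (float(x), float(y))

    def distance(self, a, b):
        return math.dist(self.points[a], self.points[b])


def propose(feedback):
    """Hypothetical proposer standing in for the model."""
    if feedback is None:
        return 1.0  # deliberately wrong first guess for M's x-coordinate
    # Use the measured residual |AM| - |MB| to correct the previous guess.
    return feedback["x"] - feedback["residual"] / 2.0


def verify(canvas, tol=1e-9):
    """Measure the canvas and return a structured observation."""
    residual = canvas.distance("A", "M") - canvas.distance("M", "B")
    return {
        "x": canvas.points["M"][0],
        "residual": residual,
        "ok": abs(residual) <= tol,
    }


def propose_draw_verify(max_rounds=5):
    canvas = Canvas()
    canvas.draw("A", 0, 0)
    canvas.draw("B", 4, 0)
    feedback, history = None, []
    for _ in range(max_rounds):
        x = propose(feedback)      # propose a hypothesis
        canvas.draw("M", x, 0)     # draw it on the canvas
        feedback = verify(canvas)  # verify by measurement
        history.append(feedback)
        if feedback["ok"]:
            break
    return canvas, history


canvas, history = propose_draw_verify()
```

In this toy run the first proposal fails the check, and the measured residual is what lets the second proposal land on the midpoint. In the framework described above, the constraint engine supplies the measurements and the model decides the next action.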
Problem

Research questions and friction points this paper is trying to address.

geometry reasoning
constraint satisfaction
verifiability
externalization
vision-language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

constraint-based reasoning
geometric verification
executable canvas
vision-language models
externalized reasoning