FormalEvolve: Neuro-Symbolic Evolutionary Search for Diverse and Prover-Effective Autoformalization

📅 2026-03-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the observation that automatically formalized natural-language mathematical statements, even when semantically consistent, vary substantially in how effective they are for theorem provers, which limits downstream proof success. The authors frame autoformalization as a test-time search problem under a fixed generator-call budget and propose FormalEvolve, a compilation-gated neuro-symbolic evolutionary framework. The framework combines large language model-driven mutation and crossover operators, bounded patch repair, and symbolic rewriting of abstract syntax trees (ASTs) to generate diverse, semantically consistent formalization candidates. On CombiBench and ProofNet, under a generator-call budget of T = 100, the approach reaches semantic hit rates (SH@100) of 58.0% and 84.9%, respectively, and lowers the concentration of semantic successes across problems as measured by the Gini index. Under a fixed prover budget, it also improves downstream proving performance on CombiBench.
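The budgeted, compilation-gated search described above can be pictured with a minimal sketch. It is not the authors' implementation: `budgeted_search`, `propose`, and `compiles` are hypothetical names, `propose` stands in for the LLM-driven mutation and crossover step, `compiles` stands in for the compiler check, and the sketch assumes the seed candidate does not count against the budget. Patch repair and AST rewriting are omitted.

```python
import random


def budgeted_search(seed, propose, compiles, budget=100, rng=None):
    """Toy compile-gated evolutionary loop (illustrative assumption, not FormalEvolve itself).

    Each call to `propose` consumes one unit of the generator-call budget.
    Only candidates that pass the `compiles` gate are kept in the archive.
    """
    rng = rng or random.Random(0)
    archive = [seed] if compiles(seed) else []
    for _ in range(budget):
        # Pick up to two parents from the archive; fall back to the seed if it is empty.
        parents = rng.sample(archive, k=min(2, len(archive))) if archive else [seed]
        child = propose(parents)  # hypothetical stand-in for LLM mutation/crossover
        if compiles(child) and child not in archive:
            archive.append(child)  # gate: keep only compilable, novel candidates
    return archive
```

The returned archive plays the role of the candidate repertoire; in this sketch, diversity is enforced only by rejecting exact duplicates.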

📝 Abstract

Autoformalization aims to translate natural-language mathematics into compilable, machine-checkable statements. However, semantic consistency does not imply prover effectiveness: even semantically consistent formalizations can differ substantially in proof-search cost and success rate. In this work, we formulate autoformalization as a budgeted, test-time search for semantically consistent repertoires, and propose FormalEvolve, a compilation-gated neuro-symbolic evolutionary framework. FormalEvolve generates diverse candidates via LLM-driven mutation and crossover with bounded patch repair, while symbolic Abstract Syntax Tree (AST) rewrite operations further inject structural diversity. On CombiBench and ProofNet, under a strict generator-call budget of T = 100, FormalEvolve reaches semantic hit rates (SH@100) of 58.0% and 84.9%, and reduces cross-problem concentration of semantic successes (lower Gini). Under a fixed prover budget, FormalEvolve also improves downstream proving performance on CombiBench. Code will be released publicly.
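To make the "lower Gini" claim concrete, here is a small sketch of the standard Gini coefficient applied to per-problem counts of semantic successes. The abstract does not give the paper's exact definition, so the formula and the function name `gini` are assumptions; the input counts in the usage note are made up for illustration only.

```python
def gini(counts):
    """Standard Gini coefficient of non-negative counts.

    0 means successes are spread evenly across problems; values closer to 1
    mean successes are concentrated on a few problems.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-data formula: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n
```

For example, `gini([5, 5, 5, 5])` is 0.0 (every problem gets the same number of successes), while `gini([0, 0, 0, 20])` is 0.75 (all successes fall on one of four problems).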

Problem

Research questions and friction points this paper is trying to address.

autoformalization

prover effectiveness

semantic consistency

theorem proving

formal mathematics

Innovation

Methods, ideas, or system contributions that make the work stand out.

neuro-symbolic

evolutionary search

autoformalization