FormalEvolve: Neuro-Symbolic Evolutionary Search for Diverse and Prover-Effective Autoformalization

📅 2026-03-20
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
This work addresses the observation that automatically formalized mathematical statements, even when semantically consistent with the natural-language original, vary substantially in how effective they are for theorem provers, which limits downstream proof success. The authors frame autoformalization as a test-time search under a fixed generator-call budget and propose FormalEvolve, a compilation-gated neuro-symbolic evolutionary framework. It combines LLM-driven mutation and crossover with bounded patch repair, and uses symbolic abstract syntax tree (AST) rewrites to add structural diversity, aiming for a repertoire of semantically consistent, diverse candidates. On CombiBench and ProofNet with a generator-call budget of T = 100, the approach reaches semantic hit rates (SH@100) of 58.0% and 84.9%, respectively, and lowers the cross-problem concentration of semantic successes as measured by the Gini index. Under a fixed prover budget, it also improves downstream proving performance on CombiBench.
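To make the Gini index mentioned in the summary concrete, here is a minimal sketch of the standard Gini coefficient applied to per-problem success counts. It assumes that "concentration" is measured over the number of semantically consistent candidates found for each problem; the function name `gini` and the example counts are illustrative and are not taken from the paper.

```python
def gini(counts):
    """Standard Gini coefficient of non-negative values.

    0.0 means successes are spread evenly across problems;
    values near 1.0 mean a few problems hold almost all successes.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over ascending values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-problem success counts (not data from the paper).
even_spread = [5, 5, 5, 5]
concentrated = [0, 0, 0, 20]
print(gini(even_spread), gini(concentrated))
```

Under this reading, a lower Gini value means semantic successes are distributed across more problems rather than piling up on a few easy ones.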

📝 Abstract
Autoformalization aims to translate natural-language mathematics into compilable, machine-checkable statements. However, semantic consistency does not imply prover effectiveness: even semantically consistent formalizations can differ substantially in proof-search cost and success rate. In this work, we formulate autoformalization as a budgeted, test-time search for semantically consistent repertoires, and propose FormalEvolve, a compilation-gated neuro-symbolic evolutionary framework. FormalEvolve generates diverse candidates via LLM-driven mutation and crossover with bounded patch repair, while symbolic Abstract Syntax Tree (AST) rewrite operations further inject structural diversity. On CombiBench and ProofNet, under a strict generator-call budget of T = 100, FormalEvolve reaches semantic hit rates (SH@100) of 58.0% and 84.9%, respectively, and reduces cross-problem concentration of semantic successes (lower Gini). Under a fixed prover budget, FormalEvolve also improves downstream proving performance on CombiBench. Code will be released publicly.
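The budgeted, compilation-gated search described in the abstract can be pictured with a toy loop. This is only a sketch under stated assumptions, not the authors' implementation: candidates are plain strings, `toy_compiles` (a balanced-parentheses check) stands in for a real compiler, `toy_mutate` and `toy_crossover` stand in for LLM calls, the 0.3 crossover probability is arbitrary, and bounded patch repair, AST rewrites, and semantic checking are omitted. All names here are hypothetical.

```python
import random


def toy_compiles(candidate: str) -> bool:
    """Stand-in for the compilation gate: accept only balanced parentheses."""
    depth = 0
    for ch in candidate:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def toy_mutate(parent: str, rng: random.Random) -> str:
    """Stand-in for an LLM mutation call: apply one random string edit."""
    edits = [
        lambda s: "(" + s + ")",  # wrap, keeps parentheses balanced
        lambda s: s + "x",        # append a symbol
        lambda s: s[:-1],         # drop the last character, may break balance
    ]
    return rng.choice(edits)(parent)


def toy_crossover(a: str, b: str, rng: random.Random) -> str:
    """Stand-in for an LLM crossover call: splice the halves of two parents."""
    return a[: len(a) // 2] + b[len(b) // 2:]


def evolve(seed, mutate, crossover, compiles, budget=100, rng=None):
    """Spend exactly `budget` generator calls; keep only gated, distinct candidates."""
    rng = rng or random.Random(0)
    archive = [seed] if compiles(seed) else []
    calls = 0
    while calls < budget:
        if len(archive) >= 2 and rng.random() < 0.3:  # assumed crossover rate
            a, b = rng.sample(archive, 2)
            child = crossover(a, b, rng)
        else:
            parent = rng.choice(archive) if archive else seed
            child = mutate(parent, rng)
        calls += 1  # every mutation or crossover counts against the budget
        if compiles(child) and child not in archive:
            archive.append(child)  # the gate: non-compiling children are dropped
    return archive, calls


archive, calls = evolve("(a)", toy_mutate, toy_crossover, toy_compiles, budget=100)
print(calls, len(archive))
```

The point of the sketch is the accounting: the budget T counts generator calls, not surviving candidates, so a child that fails the gate still consumes budget.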
Problem

Research questions and friction points this paper is trying to address.

autoformalization
prover effectiveness
semantic consistency
theorem proving
formal mathematics
Innovation

Methods, ideas, or system contributions that make the work stand out.

neuro-symbolic
evolutionary search
autoformalization
AST rewriting
prover effectiveness
Haijian Lu
School of Artificial Intelligence, Xidian University, Xi’an, China
Wei Wang
Beijing Jiaotong University, University of Trento, EPFL
Computer Vision, Augmented Reality, Machine Learning
Jing Liu
School of Artificial Intelligence, Xidian University, Xi’an, China