AI Summary
Large language models (LLMs) cannot guarantee correctness in formal rewriting, and they are particularly unreliable at generating sound, complete rewrite chains. To address this, we propose LGuess, a hybrid framework that couples LLMs with equality saturation. LGuess uses e-graphs as an intermediate representation: the LLM supplies only high-level semantic checkpoints (e.g., key factorization steps), a probabilistic model learned from the LLM extracts a suitable checkpoint from the saturated e-graph, and the e-graph supplies the low-level equational rewrite chain between checkpoints. This design avoids the unreliability of end-to-end rewrite-chain generation by combining high-level semantic guidance with sound formal reasoning. Evaluated on multivariate polynomial factorization, LGuess significantly improves rewriting success rates and solving efficiency over both pure equality saturation and direct LLM-based rewriting.
Abstract
One critical issue with large language models (LLMs) is their inability to guarantee correctness. Although this problem can be addressed by applying LLMs to formal rewrite systems, current LLMs are still far from adequate to generate sound rewrite chains. To bridge this gap, this paper proposes LLM-guided equality saturation, dubbed LGuess, by incorporating e-graphs as an intermediate layer between LLMs and rewrite systems. LGuess queries LLMs only for high-level rewrite checkpoints and uses e-graphs to supply low-level rewrite chains between these checkpoints. The key technical challenge in this procedure lies in effectively extracting a suitable checkpoint from a saturated e-graph, which LGuess addresses by learning a probabilistic model from the LLM. The model predicts probable checkpoints while remaining simple enough for effective extraction. We implement a prototype of LGuess and evaluate it on the problem of factorizing multivariable polynomials. The results demonstrate a significant advantage of LGuess compared to both straightforward equality saturation and the approach that queries the LLM directly for the rewrite chain.
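To make the split between checkpoints and low-level rewrite chains concrete, here is a small worked factorization. It is an illustrative sketch only: the polynomial, the chosen checkpoint, and the rule names are assumptions made for this example and are not taken from the paper. The middle line plays the role of a high-level checkpoint, and the bracketed annotations name the kind of low-level rewrite steps that equality saturation would be expected to supply between checkpoints.

```latex
\begin{align*}
x^2 + xy + xz + yz
  &= (x \cdot x + x \cdot y) + (z \cdot x + z \cdot y)
     && \text{[associativity, commutativity]} \\
  &= x(x + y) + z(x + y)
     && \text{[distributivity, applied twice; checkpoint]} \\
  &= (x + z)(x + y)
     && \text{[distributivity]}
\end{align*}
```

In this reading, a query to the LLM would only need to suggest the intermediate form $x(x + y) + z(x + y)$, while each individual rule application remains justified by the rewrite system.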