CircuitBuilder: From Polynomials to Circuits via Reinforcement Learning

📅 2026-03-17

🤖 AI Summary
This work addresses the problem of automatically synthesizing efficient arithmetic circuits, built only from addition and multiplication gates, that compute a given polynomial. The task is formulated as a single-player sequential decision-making game in which an agent must build the circuit within a fixed number of operations. Within an AlphaZero-style training loop, two reinforcement learning approaches are compared: Proximal Policy Optimization combined with Monte Carlo Tree Search (PPO+MCTS) and Soft Actor-Critic (SAC). The resulting environment is compact and verifiable, which makes it a convenient setting for studying self-improving search policies; the work is motivated in part by questions in algebraic complexity theory such as Valiant's VP vs. VNP conjecture. Empirically, SAC achieves the highest success rates on two-variable targets, while PPO+MCTS scales to three variables and shows steady improvement on harder instances.
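
To make "arithmetic circuit" and "verifiable" concrete, here is a minimal Python sketch. It is not taken from the paper: the gate encoding `(op, i, j)`, the function names, and the random-point check are illustrative assumptions. The example target x^2 + 2xy + y^2 can be computed with two gates as (x + y) * (x + y), which is the kind of saving a synthesis agent would be looking for.

```python
import random

# Assumed encoding (not from the paper): a circuit is a list of gates
# (op, i, j), where i and j index earlier nodes. Nodes 0..n-1 are the inputs.
def evaluate(circuit, inputs):
    """Evaluate the circuit at a point; the last gate is the output."""
    nodes = list(inputs)
    for op, i, j in circuit:
        a, b = nodes[i], nodes[j]
        nodes.append(a + b if op == "add" else a * b)
    return nodes[-1]

def target(x, y):
    """Illustrative target polynomial: x^2 + 2xy + y^2."""
    return x * x + 2 * x * y + y * y

# Two gates: node 2 = x + y, node 3 = node 2 * node 2.
circuit = [("add", 0, 1), ("mul", 2, 2)]

def agrees_on_random_points(circuit, target, n_vars=2, trials=20, seed=0):
    """Probabilistic check: compare circuit and target at random integer points.

    Agreement is evidence of equality, not a proof; an exact check would
    expand the circuit symbolically and compare coefficients.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.randint(-50, 50) for _ in range(n_vars)]
        if evaluate(circuit, point) != target(*point):
            return False
    return True

if __name__ == "__main__":
    print(agrees_on_random_points(circuit, target))
```

Checking at random points is only one possible verification strategy; the summary does not say how the paper verifies a candidate circuit.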

📝 Abstract
Motivated by auto-proof generation and Valiant's VP vs. VNP conjecture, we study the problem of discovering efficient arithmetic circuits to compute polynomials, using addition and multiplication gates. We formulate this problem as a single-player game, where an RL agent attempts to build the circuit within a fixed number of operations. We implement an AlphaZero-style training loop and compare two approaches: Proximal Policy Optimization with Monte Carlo Tree Search (PPO+MCTS) and Soft Actor-Critic (SAC). SAC achieves the highest success rates on two-variable targets, while PPO+MCTS scales to three variables and demonstrates steady improvement on harder instances. These results suggest that polynomial circuit synthesis is a compact, verifiable setting for studying self-improving search policies.
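
The single-player game described in the abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' environment: the class name `CircuitGame`, the sparse-dictionary polynomial representation, the choice of starting nodes (variables only, no constants), the 0/1 reward, and the exhaustive search that stands in for a learned policy are all hypothetical.

```python
import copy
from itertools import product

# Polynomials as dicts: exponent tuple -> nonzero integer coefficient.
def poly_add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
        if out[mono] == 0:
            del out[mono]
    return out

def poly_mul(p, q):
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        mono = tuple(a + b for a, b in zip(m1, m2))
        out[mono] = out.get(mono, 0) + c1 * c2
        if out[mono] == 0:
            del out[mono]
    return out

class CircuitGame:
    """Hypothetical environment: each move adds one gate over existing nodes."""

    def __init__(self, target, n_vars, max_ops):
        self.target = target
        self.max_ops = max_ops  # fixed budget of operations
        # Assumption: the starting nodes are just the variables x_0..x_{n-1}.
        self.nodes = [
            {tuple(int(k == v) for k in range(n_vars)): 1} for v in range(n_vars)
        ]
        self.ops_used = 0

    def legal_actions(self):
        n = len(self.nodes)
        # Both gates are commutative, so only pairs with i <= j are listed.
        return [(op, i, j) for op in ("add", "mul")
                for i in range(n) for j in range(i, n)]

    def step(self, action):
        op, i, j = action
        combine = poly_add if op == "add" else poly_mul
        self.nodes.append(combine(self.nodes[i], self.nodes[j]))
        self.ops_used += 1
        solved = self.nodes[-1] == self.target
        done = solved or self.ops_used >= self.max_ops
        return (1.0 if solved else 0.0), done

def brute_force(game, path=()):
    """Depth-first search over move sequences; a stand-in for an RL policy."""
    for action in game.legal_actions():
        child = copy.deepcopy(game)
        reward, done = child.step(action)
        if reward > 0:
            return path + (action,)
        if not done:
            found = brute_force(child, path + (action,))
            if found:
                return found
    return None

# Illustrative target: x^2 + 2xy + y^2 in two variables.
TARGET = {(2, 0): 1, (1, 1): 2, (0, 2): 1}

if __name__ == "__main__":
    print(brute_force(CircuitGame(TARGET, n_vars=2, max_ops=2)))
```

With a budget of two operations the search finds the sequence add(x, y) followed by squaring the result; with a budget of one it returns `None`. Exhaustive search grows quickly with the budget and the number of nodes, which is the gap a learned policy such as PPO+MCTS or SAC is meant to close.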

Problem

Research questions and friction points this paper is trying to address.

arithmetic circuits
polynomial computation
circuit synthesis
VP vs. VNP
efficient computation

Innovation

Methods, ideas, or system contributions that make the work stand out.

arithmetic circuit synthesis
reinforcement learning
AlphaZero
polynomial computation
self-improving search