🤖 AI Summary
Traditional symbolic regression methods rely on fitting error to steer the search, which often yields ambiguous search directions and convergence difficulties in vast expression spaces. This work proposes a goal-conditioned reinforcement learning framework that abandons the error-driven paradigm, instead training an action-value network with Hindsight Experience Replay (HER) to generalize input-output mapping patterns. The approach introduces a binary reward based on all-point satisfaction and a structure-guided exploration strategy. These innovations substantially enhance the robustness and diversity of the search, enabling more efficient and accurate recovery of complex mathematical expressions under identical computational budgets, with a recovery rate on established benchmarks that surpasses current state-of-the-art methods.
📝 Abstract
Symbolic Regression aims to automatically identify compact and interpretable mathematical expressions that model the functional relationship between input and output variables. Most existing search-based symbolic regression methods rely on the fitting error to inform the search process. However, in the vast expression space, numerous candidate expressions may exhibit similar error values while differing substantially in structure, leading to ambiguous search directions and hindering convergence to the underlying true function. To address this challenge, we propose a novel framework named EGRL-SR (Experience-driven Goal-conditioned Reinforcement Learning for Symbolic Regression). In contrast to traditional error-driven approaches, EGRL-SR introduces a new perspective: leveraging precise historical trajectories and optimizing the action-value network to proactively guide the search process, thereby achieving a more robust expression search. Specifically, we formulate symbolic regression as a goal-conditioned reinforcement learning problem and incorporate hindsight experience replay, allowing the action-value network to generalize common mapping patterns from diverse input-output pairs. Moreover, we design an all-point satisfaction binary reward function that encourages the action-value network to focus on structural patterns rather than low-error expressions, and we propose a structure-guided heuristic exploration strategy to enhance search diversity and space coverage. Experiments on public benchmarks show that EGRL-SR consistently outperforms state-of-the-art methods in recovery rate and robustness, and can recover more complex expressions under the same search budget. Ablation results validate that the action-value network effectively guides the search, with both the reward function and the exploration strategy playing critical roles.
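To make the two core mechanisms concrete, the sketch below illustrates an all-point satisfaction binary reward and a hindsight relabeling step in the spirit of the abstract. This is a minimal illustration under our own assumptions (function names, the `tol` tolerance, and the trajectory dictionary layout are hypothetical), not the authors' implementation.

```python
import numpy as np

def all_point_reward(expr_fn, X, y, tol=1e-6):
    """Binary reward: 1.0 only if the candidate expression matches the
    target outputs at EVERY data point (within a small tolerance),
    otherwise 0.0. Unlike an error-based score, near-miss expressions
    receive no partial credit, pushing the search toward exact structure.
    `tol` is an assumed numerical tolerance, not from the paper."""
    pred = expr_fn(X)
    return 1.0 if np.all(np.abs(pred - y) <= tol) else 0.0

def her_relabel(trajectory, X):
    """Hindsight relabeling sketch: a failed episode (reward 0 for the
    original goal y) is reused by declaring the outputs the expression
    actually produced to be the goal. The same action sequence then
    becomes a successful example of an input-output mapping, giving the
    action-value network dense positive training signal."""
    achieved_goal = trajectory["expr_fn"](X)  # outputs actually achieved
    return dict(trajectory, goal=achieved_goal, reward=1.0)
```

For example, with targets `y = x**2`, the candidate `x**2` earns reward 1 while `x**2 + 1` earns 0; HER then relabels the failed `x**2 + 1` episode as a success for the goal it did achieve, so no search experience is wasted.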