Scaling Up Unbiased Search-based Symbolic Regression

📅 2025-06-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Symbolic regression aims to learn interpretable, generalizable explicit functional expressions from labeled data, bypassing traditional parametric optimization based on basis-function expansions. This paper proposes an unbiased, systematic search method over the space of symbolic expressions: assuming only decomposability as a minimal prior, it performs efficient and robust global exploration by bottom-up construction and evaluation of small-scale subexpressions. The approach significantly improves recovery accuracy of ground-truth generating expressions, outperforms existing state-of-the-art methods across multiple standard benchmarks, and demonstrates superior noise robustness and higher predictive accuracy. Its core innovation lies in decoupling prior constraints from the search mechanism: leveraging structured decomposition to enable reliable discovery of interpretable models.
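The bottom-up search idea can be illustrated with a minimal sketch. This is not the authors' implementation; the toy data, the leaf/operator sets, and the size measure (number of leaves) are assumptions chosen for brevity. Expressions of size n are built by combining all pairs of smaller subexpressions with binary operators, then every candidate is scored against the data:

```python
import itertools
import math

# Toy data generated from a hidden target y = x**2 + x (unknown to the search).
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [x**2 + x for x in xs]

# Size-1 building blocks: the input variable and a constant.
LEAVES = [("x", lambda x: x), ("1", lambda x: 1.0)]
BINOPS = [("+", lambda a, b: a + b),
          ("*", lambda a, b: a * b),
          ("-", lambda a, b: a - b)]

def enumerate_expressions(max_size):
    """Bottom-up construction: a size-n expression combines two smaller ones."""
    by_size = {1: list(LEAVES)}
    for n in range(2, max_size + 1):
        by_size[n] = []
        for left in range(1, n):  # split n leaves between the two subexpressions
            right = n - left
            for (ls, lf), (rs, rf) in itertools.product(by_size[left], by_size[right]):
                for op, f in BINOPS:
                    # Bind lf, rf, f as defaults so each lambda keeps its own closure.
                    by_size[n].append((f"({ls} {op} {rs})",
                                       lambda x, lf=lf, rf=rf, f=f: f(lf(x), rf(x))))
    return [e for exprs in by_size.values() for e in exprs]

def best_expression(max_size=3):
    """Systematic search: return the lowest-squared-error expression up to max_size."""
    best, best_err = None, math.inf
    for s, f in enumerate_expressions(max_size):
        err = sum((f(x) - y) ** 2 for x, y in zip(xs, ys))
        if err < best_err:
            best, best_err = s, err
    return best, best_err
```

With `max_size=3` the search recovers an expression equivalent to x² + x with zero error, illustrating how exhaustive enumeration of small decomposable expressions can find the exact generating formula rather than just a good fit.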

📝 Abstract
In a regression task, a function is learned from labeled data to predict the labels at new data points. The goal is to achieve small prediction errors. In symbolic regression, the goal is more ambitious, namely, to learn an interpretable function that makes small prediction errors. This additional goal largely rules out the standard approach used in regression, that is, reducing the learning problem to learning parameters of an expansion of basis functions by optimization. Instead, symbolic regression methods search for a good solution in a space of symbolic expressions. To cope with the typically vast search space, most symbolic regression methods make implicit, or sometimes even explicit, assumptions about its structure. Here, we argue that the only obvious structure of the search space is that it contains small expressions, that is, expressions that can be decomposed into a few subexpressions. We show that systematically searching spaces of small expressions finds solutions that are more accurate and more robust against noise than those obtained by state-of-the-art symbolic regression methods. In particular, systematic search outperforms state-of-the-art symbolic regressors in terms of its ability to recover the true underlying symbolic expressions on established benchmark data sets.
Problem

Research questions and friction points this paper is trying to address.

Learn interpretable functions from labeled data
Systematically search small symbolic expression spaces
Improve accuracy and robustness in symbolic regression
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic search in small expression spaces
Outperforms state-of-the-art symbolic regressors
Focuses on interpretable and accurate functions