LaSER: How Learning Can Guide the Evolution of Equations

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
In symbolic regression, conventional genetic programming (GP) suffers from poor generalization and struggles to balance accuracy with symbolic interpretability. To address this, the paper proposes LaSER (Latent Semantic Evolutionary Regression), which embeds supervised learning into the fitness-evaluation stage without altering syntax-tree representations or evolutionary operators. Each GP individual produces a semantic representation of its input-output behavior, which is passed to a supervised learner; the quality of the learned mapping determines the individual's fitness. The key contribution is the integration of behavior-level supervised learning into non-differentiable symbolic evolution, decoupling learning from search while preserving full symbolic interpretability throughout optimization. On standard symbolic regression benchmarks, LaSER significantly outperforms classical GP and, in several cases, matches or exceeds the generalization performance of popular machine learning regressors, while producing exact, human-readable mathematical expressions.

📝 Abstract
Evolution and learning are two distinct yet complementary forms of adaptation. While evolutionary processes operate across generations via the selection of genotypes, learning occurs within the lifetime of an individual, shaping behavior through phenotypic adjustment. The Baldwin effect describes how lifetime learning can improve evolutionary search without altering inherited structures. While this has proven effective in areas like neuroevolution, where gradient-based learning is often used to fine-tune weights or behaviors produced by evolution, it remains underexplored in systems that evolve non-differentiable symbolic structures, such as Genetic Programming (GP). GP evolves explicit syntax trees that represent equations, offering strong interpretability but limited generalization due to the burden of discovering both useful representations and precise mappings. Here, we show for the first time that integrating a simple form of supervised learning, applied at the semantic or behavioral level during evaluation, can effectively guide the evolution of equations in GP. To achieve this, we propose a new GP pipeline, LaSER (Latent Semantic Evolutionary Regression), in which each GP individual generates a semantic representation that is passed to a supervised learner. The quality of the learned mapping is used to assign fitness, without modifying the underlying syntax tree or evolutionary process. Across standard symbolic regression benchmarks, LaSER significantly outperforms traditional GP in generalization ability and, in several cases, matches or exceeds popular machine learning regressors, while preserving symbolic interpretability. By separating evolution from learning, LaSER offers a practical route to integrating GP with modern ML workflows, and opens new avenues for research at the intersection of evolutionary computation and representation learning.
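The fitness assignment described in the abstract can be sketched concretely: an individual's semantics are its outputs on the training inputs, a supervised learner (here, ordinary least squares, as a stand-in for whatever learner the paper actually uses) is fit from those semantics to the targets, and the error of the learned mapping becomes the fitness. The function name and details below are illustrative, not the paper's implementation.

```python
import numpy as np

def laser_style_fitness(semantics, y):
    """Illustrative LaSER-style fitness (not the paper's exact code).

    semantics : the individual's outputs on the training inputs
    y         : the regression targets

    Fit a linear map y ~ a * semantics + b by least squares and
    return its mean squared error. The syntax tree that produced
    `semantics` is never modified -- learning stays at the
    behavioral level, evolution at the genotype level.
    """
    # Design matrix [semantics, 1] for the affine fit
    X = np.column_stack([semantics, np.ones_like(semantics)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return float(np.mean(residuals ** 2))  # lower is better

# Toy example: the individual computes s(x) = x**2, while the
# target is 3*x**2 + 1. A plain GP fitness would penalize the
# mismatch; the learned linear map absorbs the scale and offset,
# so the fitness is near zero.
x = np.linspace(-1.0, 1.0, 50)
semantics = x ** 2
targets = 3 * x ** 2 + 1
print(laser_style_fitness(semantics, targets))
```

The toy example shows why this decoupling helps search: an individual whose tree captures the right functional shape is rewarded immediately, even before evolution discovers the exact constants.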
Problem

Research questions and friction points this paper is trying to address.

Guiding the evolution of equations in Genetic Programming with supervised learning
Improving generalization in symbolic regression without sacrificing interpretability
Integrating supervised learning with evolutionary computation for better performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Embeds supervised learning into Genetic Programming's fitness evaluation
Uses semantic (behavioral) representations of individuals for fitness assignment
Preserves symbolic interpretability while improving generalization