GESR: A Genetic Programming-Based Symbolic Regression Method with Gene Editing

📅 2026-05-11

🤖 AI Summary
This work addresses symbolic regression, the problem of automatically discovering mathematical expressions that describe natural phenomena from scientific data, by proposing a gene-editing-inspired approach called GESR. The method combines genetic programming with two BERT models: one uses masked language modeling to guide mutation of expression symbols, and the other predicts the crossover point to guide crossover. This replaces the purely random perturbations of traditional genetic programming with directed edits. The authors report that GESR significantly improves computational efficiency compared with traditional genetic programming and achieves strong overall performance across multiple symbolic regression tasks.
📝 Abstract
Mathematical formulas serve as a language through which humans communicate with nature. Discovering mathematical laws from scientific data to describe natural phenomena has been a pursuit of humanity for centuries. In the field of artificial intelligence, this challenge is known as the symbolic regression problem. Among existing symbolic regression approaches, Genetic Programming (GP), which is based on evolutionary algorithms, remains one of the most classical and widely adopted methods. GP simulates the evolutionary process across generations through genetic mutation and crossover. However, mutations and crossovers in GP are entirely random. While this randomness effectively mimics natural evolution, it inevitably produces both beneficial and detrimental variations. If there existed a metaphorical "God" capable of foreseeing which genetic mutations or crossovers would yield superior outcomes and performing targeted gene editing accordingly, the efficiency of evolution could be substantially improved. Motivated by this idea, we propose in this paper a symbolic regression approach based on gene editing, termed GESR. In GESR, we train two "hands of God" (two BERT models). The first uses BERT's masked language modeling capability to guide the mutation of genes (expression symbols). The second guides the crossover of individual genes by predicting the crossover point. Experimental results demonstrate that GESR significantly improves computational efficiency compared with traditional GP algorithms and achieves strong overall performance across multiple symbolic regression tasks.
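To make the contrast between random and guided variation concrete, here is a minimal Python sketch. It is not the authors' implementation. The prefix-notation encoding, the symbol set `ARITY`, and the functions `toy_predict_masked` and `toy_predict_point` are all hypothetical stand-ins: in GESR the two predictions would come from trained BERT models, whose inputs and outputs the abstract does not specify.

```python
import random
from typing import Callable, Dict, List, Tuple

# Assumption: expressions are prefix-notation token lists,
# e.g. ["add", "x", "sin", "x"] stands for x + sin(x).
ARITY: Dict[str, int] = {"add": 2, "mul": 2, "sin": 1, "x": 0, "1": 0}
MASK = "[MASK]"

def subtree_end(tokens: List[str], start: int) -> int:
    """Return the index one past the subtree rooted at `start`."""
    need, i = 1, start
    while need > 0:
        need += ARITY[tokens[i]] - 1
        i += 1
    return i

def random_mutation(tokens: List[str], rng: random.Random) -> List[str]:
    """Classical GP point mutation: random position, random same-arity symbol."""
    pos = rng.randrange(len(tokens))
    same_arity = [s for s in ARITY if ARITY[s] == ARITY[tokens[pos]]]
    out = list(tokens)
    out[pos] = rng.choice(same_arity)
    return out

def guided_mutation(
    tokens: List[str],
    pos: int,
    predict_masked: Callable[[List[str], int], Dict[str, float]],
) -> List[str]:
    """Mask one symbol and replace it with the model's top same-arity symbol."""
    masked = list(tokens)
    masked[pos] = MASK
    scores = predict_masked(masked, pos)
    # Keep only symbols with the original arity so the expression stays valid.
    candidates = {s: p for s, p in scores.items()
                  if ARITY.get(s) == ARITY[tokens[pos]]}
    out = list(tokens)
    out[pos] = max(candidates, key=candidates.get)
    return out

def guided_crossover(
    a: List[str],
    b: List[str],
    predict_point: Callable[[List[str]], int],
) -> Tuple[List[str], List[str]]:
    """Swap the subtrees rooted at the crossover points a model predicts."""
    i, j = predict_point(a), predict_point(b)
    a_end, b_end = subtree_end(a, i), subtree_end(b, j)
    child_a = a[:i] + b[j:b_end] + a[a_end:]
    child_b = b[:j] + a[i:a_end] + b[b_end:]
    return child_a, child_b

# Hypothetical stand-ins for the two trained BERT models.
def toy_predict_masked(masked: List[str], pos: int) -> Dict[str, float]:
    return {"add": 0.2, "mul": 0.6, "sin": 0.9, "x": 0.7, "1": 0.3}

def toy_predict_point(tokens: List[str]) -> int:
    return 1

parent_a = ["add", "x", "sin", "x"]   # x + sin(x)
parent_b = ["mul", "1", "x"]          # 1 * x
mutated = guided_mutation(parent_a, 0, toy_predict_masked)
children = guided_crossover(parent_a, parent_b, toy_predict_point)
```

With these stand-ins, `mutated` is `["mul", "x", "sin", "x"]` and `children` is `(["add", "1", "sin", "x"], ["mul", "x", "x"])`. Restricting replacements to symbols of the same arity is a design choice of this sketch that keeps every offspring a well-formed expression; whether GESR applies the same constraint is not stated in the abstract.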
Problem

Research questions and friction points this paper is trying to address.

symbolic regression
genetic programming
gene editing
mathematical discovery
evolutionary algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Symbolic Regression
Genetic Programming
Gene Editing
BERT
Evolutionary Algorithm