A Performance Analysis of Lexicase-Based and Traditional Selection Methods in GP for Symbolic Regression

📅 2024-07-31

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work studies how selection mechanisms and down-sampling strategies interact in genetic programming for symbolic regression. It compares lexicase selection variants, including ε-lexicase and batch-based lexicase, with traditional selection methods such as tournament selection, each paired with different down-sampling strategies. Methods are compared under both a fixed evaluation budget and a fixed running-time budget. For a given evaluation budget, ε-lexicase selection combined with down-sampling outperforms all other methods; for very short running times, lexicase variants that use batches of training cases perform best; and tournament selection with informed down-sampling performs well in all studied settings. The results offer practical guidance for choosing selection and down-sampling methods when running time is limited.

📝 Abstract

In recent years, several new lexicase-based selection variants have emerged due to the success of standard lexicase selection in various application domains. For symbolic regression problems, variants that use an epsilon-threshold or batches of training cases, among others, have led to performance improvements. Lately, especially variants that combine lexicase selection and down-sampling strategies have received a lot of attention. This paper evaluates the most relevant lexicase-based selection methods as well as traditional selection methods in combination with different down-sampling strategies on a wide range of symbolic regression problems. In contrast to most work, we not only compare the methods over a given evaluation budget, but also over a given time budget as time is usually limited in practice. We find that for a given evaluation budget, epsilon-lexicase selection in combination with a down-sampling strategy outperforms all other methods. If the given running time is very short, lexicase variants using batches of training cases perform best. Further, we find that the combination of tournament selection with informed down-sampling performs well in all studied settings.
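As a rough illustration of the epsilon-threshold idea mentioned in the abstract, the sketch below selects one parent with ε-lexicase selection. It is a minimal sketch under assumptions, not the paper's implementation: `errors[i][j]` is taken to be the absolute error of individual `i` on training case `j`, and ε is computed per case as the median absolute deviation (MAD) of the population's errors, which is one common choice in the literature. The function names are hypothetical.

```python
import random
from statistics import median


def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median([abs(v - m) for v in values])


def epsilon_lexicase_select(errors, rng=random):
    """Return the index of one selected parent.

    errors[i][j] is the absolute error of individual i on case j
    (an assumed data layout, not taken from the paper).
    """
    n_cases = len(errors[0])
    # Assumption: epsilon per case is the MAD over the whole population.
    eps = [mad([row[j] for row in errors]) for j in range(n_cases)]

    candidates = list(range(len(errors)))
    cases = list(range(n_cases))
    rng.shuffle(cases)  # each selection event uses a fresh case order

    for j in cases:
        best = min(errors[i][j] for i in candidates)
        # Keep every candidate within epsilon of the best error on this case.
        candidates = [i for i in candidates if errors[i][j] <= best + eps[j]]
        if len(candidates) == 1:
            break
    # Ties that survive all cases are broken at random.
    return rng.choice(candidates)
```

Setting every ε to zero recovers standard lexicase selection, where only candidates with exactly the best error on a case survive.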

Problem

Research questions and friction points this paper is trying to address.

Evaluates lexicase-based and traditional selection methods for symbolic regression.

Compares methods over both evaluation and time budgets for practical relevance.

Identifies best-performing selection strategies under different time constraints.
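The two kinds of budget can be sketched as stopping conditions on the same run loop. This is a hypothetical harness, not the paper's experimental code: `step` stands for one generation and is assumed to return the number of fitness evaluations it consumed, and `clock` is injectable so the time limit can be exercised deterministically.

```python
import time


def run_with_budgets(step, max_evals=None, max_seconds=None,
                     clock=time.perf_counter):
    """Run generations until the evaluation or the time budget is used up.

    step() performs one generation and returns how many fitness
    evaluations it used (hypothetical interface).
    Returns (generations, evaluations).
    """
    start = clock()
    evals = 0
    generations = 0
    while True:
        if max_evals is not None and evals >= max_evals:
            break
        if max_seconds is not None and clock() - start >= max_seconds:
            break
        evals += step()
        generations += 1
    return generations, evals
```

Under an evaluation budget, a method that evaluates fewer cases per generation simply gets more generations; under a time budget, any per-generation overhead of the selection method also counts against it.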

Innovation

Methods, ideas, or system contributions that make the work stand out.

Epsilon-lexicase selection with down-sampling

Lexicase variants using training case batches

Tournament selection with informed down-sampling
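To make the pairing of a traditional selection method with down-sampling concrete, the sketch below combines tournament selection with random down-sampling, the simplest down-sampling strategy. Informed down-sampling, as named above, would replace the random draw with a choice of cases guided by population performance; that step is not shown here. All names, the mean-error fitness, and the `errors[i][j]` layout are assumptions made for illustration.

```python
import random


def down_sample(n_cases, rate, rng=random):
    """Pick a random subset of case indices, e.g. rate=0.1 keeps 10%."""
    k = max(1, round(rate * n_cases))
    return rng.sample(range(n_cases), k)


def tournament_select(errors, case_ids, size, rng=random):
    """Return the index of the tournament winner.

    errors[i][j] is the error of individual i on case j (assumed layout).
    Fitness is the mean error over the down-sampled cases only.
    """
    contestants = rng.sample(range(len(errors)), size)

    def mean_error(i):
        return sum(errors[i][j] for j in case_ids) / len(case_ids)

    return min(contestants, key=mean_error)
```

In a typical setup the subset would be redrawn every generation, so each individual is evaluated on only `rate * n_cases` cases per generation.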
