🤖 AI Summary
The genetic programming (GP) community lacks a unified, modular benchmarking framework that enables fair comparison of tree-, linear-, and graph-based program encodings across diverse problem domains.
Method: We introduce the first modular, cross-domain benchmarking framework for GP, supporting three canonical tasks: symbolic regression, logic synthesis, and policy search. Built in Python with an embedded tinyGP engine, it decouples program representations from problem domains via standardized interfaces, syntax-aware constraint handling, domain-adaptive execution, and standardized data loading and result normalization protocols.
Contribution/Results: Evaluated on 12 standard benchmarks, our framework reveals systematic representation-domain preferences, significantly enhancing the rigor, reproducibility, and generalizability of GP evaluation. All code, benchmarks, and documentation are publicly released under an open-source license.
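To make the decoupling idea concrete, here is a minimal sketch of how program representations and problem domains can be separated behind standardized interfaces. All class and method names below are illustrative assumptions for this sketch, not the actual TinyverseGP API.

```python
# Sketch: decoupling program encodings from problem domains via abstract
# interfaces, so any representation can be benchmarked on any domain.
# Names (Representation, Problem, etc.) are hypothetical, not TinyverseGP's.
from abc import ABC, abstractmethod


class Representation(ABC):
    """A program encoding (tree-, linear-, or graph-based)."""

    @abstractmethod
    def execute(self, inputs):
        """Run the encoded program on a single input vector."""


class Problem(ABC):
    """A problem domain (e.g. symbolic regression, logic synthesis)."""

    @abstractmethod
    def evaluate(self, program: Representation) -> float:
        """Return a fitness score; lower is better by convention here."""


class TreeProgram(Representation):
    """Toy tree encoding: nested tuples (op, left, right); ints are input indices."""

    def __init__(self, tree):
        self.tree = tree

    def execute(self, inputs):
        def ev(node):
            if isinstance(node, int):  # terminal node: index into inputs
                return inputs[node]
            op, left, right = node
            a, b = ev(left), ev(right)
            return a + b if op == "+" else a * b

        return ev(self.tree)


class SymbolicRegression(Problem):
    """Scores a program by mean squared error over (inputs, target) pairs."""

    def __init__(self, data):
        self.data = data

    def evaluate(self, program):
        return sum(
            (program.execute(x) - y) ** 2 for x, y in self.data
        ) / len(self.data)


# Any representation can be scored on any domain through the same interface:
problem = SymbolicRegression([((1.0, 2.0), 3.0), ((2.0, 5.0), 7.0)])
prog = TreeProgram(("+", 0, 1))  # encodes f(x0, x1) = x0 + x1
print(problem.evaluate(prog))    # perfect fit on this data → 0.0
```

Under this design, adding a linear or graph encoding means implementing only `execute`, and a new domain means implementing only `evaluate`; the benchmarking loop never needs to know which pairing it is running.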
📝 Abstract
Over the years, genetic programming (GP) has evolved into many variants, which differ especially in how they represent candidate solutions. Being essentially a program synthesis algorithm, GP is capable of tackling multiple problem domains. Current benchmarking initiatives are fragmented: the different representations are not compared with each other, and their performance is not measured across the different domains. In this work, we propose a unified framework, dubbed TinyverseGP (inspired by tinyGP), which supports multiple representations and problem domains, including symbolic regression, logic synthesis, and policy search.