🤖 AI Summary
Tree-based genetic programming (TGP) is widely used in symbolic regression, feature engineering, and scientific modeling, but its compute-intensive nature creates a scalability bottleneck. This paper introduces EvoGP, the first full-stack GPU-accelerated framework for TGP, built on three key innovations: (1) a tensorized tree encoding that maps heterogeneous tree structures onto fixed-shape tensors; (2) a unified parallel framework for genetic operations built on shared computational primitives and dedicated CUDA kernels; and (3) a fully parallel fitness evaluation strategy that exploits both population-level and data-level parallelism, complemented by a rich operator library and a multi-task benchmark suite. Experiments show that EvoGP achieves up to a 140.89× speedup over the state-of-the-art GPU-accelerated TGP implementation while maintaining or improving accuracy, with its effectiveness and generalizability validated across symbolic regression, classification, and robot control tasks.
📝 Abstract
Tree-based Genetic Programming (TGP) is a key evolutionary algorithm widely used in symbolic regression, feature engineering, and scientific modeling. Its high computational demands make GPU acceleration essential for scalable and high-performance evolutionary computation. However, GPU acceleration of TGP faces three key challenges: inefficient tree encoding, highly heterogeneous genetic operations, and limited parallelism in fitness evaluation. To address these challenges, we introduce EvoGP, a comprehensive GPU-accelerated TGP framework. First, we design a tensorized encoding scheme to represent trees of different structures as tensors of the same shape, optimizing memory access and enabling efficient parallel execution. Second, we propose a unified parallel framework for genetic operations by leveraging shared computational primitives and implementing dedicated CUDA kernels for scalable performance. Third, we present a fully parallel fitness evaluation strategy for symbolic regression, exploiting both population-level and data-level parallelism to maximize GPU utilization. Moreover, we implement a comprehensive library of algorithm operators and benchmark problems. EvoGP is extensively tested on various tasks, including symbolic regression, classification, and robotics control, demonstrating its versatility and effectiveness across diverse application scenarios. Experimental results show that EvoGP achieves up to a 140.89× speedup over the state-of-the-art GPU-based TGP implementation, while maintaining or exceeding the accuracy of baseline methods. EvoGP is open-source and accessible at: https://github.com/EMI-Group/evogp.
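To make the two central ideas concrete, the sketch below illustrates (not EvoGP's actual implementation) how trees of different shapes can be padded into fixed-shape tensors and how a prefix-encoded tree can be evaluated over all data points at once via array broadcasting. All names (`PAD`, `eval_tree`, the node-type codes) and the specific encoding layout are illustrative assumptions; in EvoGP the same idea is realized with custom CUDA kernels that additionally parallelize over the population.

```python
import numpy as np

# Illustrative node-type codes; EvoGP's real encoding may differ.
PAD, CONST, VAR, ADD, SUB, MUL = 0, 1, 2, 3, 4, 5
MAX_LEN = 8  # fixed tensor width; shorter trees are padded with PAD

def eval_tree(types, values, X):
    """Evaluate one prefix-encoded tree on every data row at once
    (data-level parallelism via NumPy broadcasting)."""
    stack = []
    # Scan the prefix encoding right-to-left: operands push,
    # operators pop their children and push the combined result.
    for t, v in zip(types[::-1], values[::-1]):
        if t == PAD:
            continue
        if t == CONST:
            stack.append(np.full(X.shape[0], v))   # broadcast constant
        elif t == VAR:
            stack.append(X[:, int(v)])             # feature column
        else:
            a, b = stack.pop(), stack.pop()        # a = left child
            stack.append(a + b if t == ADD else a - b if t == SUB else a * b)
    return stack.pop()

# A population of 2 trees stored as two (2, MAX_LEN) tensors:
#   tree 0: add(mul(x0, x0), 1.5)    tree 1: sub(x1, x0)
types = np.array([[ADD, MUL, VAR, VAR, CONST, PAD, PAD, PAD],
                  [SUB, VAR, VAR, PAD, PAD, PAD, PAD, PAD]])
values = np.array([[0, 0, 0, 0, 1.5, 0, 0, 0],
                   [0, 1, 0, 0, 0, 0, 0, 0]], dtype=float)

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])  # 2 samples, 2 features

# Population-level parallelism is emulated here with a Python loop;
# on a GPU, each tree would map to its own block of threads.
preds = np.stack([eval_tree(t, v, X) for t, v in zip(types, values)])
print(preds)  # tree 0 -> [2.5, 10.5], tree 1 -> [1.0, 1.0]
```

Because every tree occupies the same fixed-shape slice of the `types`/`values` tensors, genetic operators such as crossover reduce to index arithmetic on contiguous memory, which is what makes the dedicated CUDA kernels efficient.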