Multi-parameter Control for the $(1+(\lambda,\lambda))$-GA on OneMax via Deep Reinforcement Learning

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
The $(1+(\lambda,\lambda))$ genetic algorithm faces challenges in dynamically coordinating multiple parameters—population size $\lambda$, mutation rate, crossover rate, and sampling count—during OneMax optimization. Method: This paper proposes an end-to-end, four-dimensional parameter co-adaptation framework based on deep reinforcement learning (specifically, the Proximal Policy Optimization algorithm). It introduces policy distillation and runtime parameter decoupling to yield an interpretable, lightweight analytical control policy, coupled with fitness-landscape-aware reward shaping. Contribution/Results: To the authors' knowledge, this is the first approach enabling real-time joint optimization of all four core parameters. For all tested problem sizes up to 40,000, the derived policy outperforms the theory-recommended default setting by 27% and irace-based automated parameter tuning, the strongest existing control policy on this benchmark, by 13%—demonstrating both theoretical competitiveness and practical deployability.

📝 Abstract
It is well known that evolutionary algorithms can benefit from dynamic choices of the key parameters that control their behavior, to adjust their search strategy to the different stages of the optimization process. A prominent example where dynamic parameter choices have shown a provable super-constant speed-up is the $(1+(\lambda,\lambda))$ Genetic Algorithm optimizing the OneMax function. While optimal parameter control policies result in linear expected running times, this is not possible with static parameter choices. This result has spurred a lot of interest in parameter control policies. However, many works, in particular theoretical running time analyses, focus on controlling one single parameter. Deriving policies for controlling multiple parameters remains very challenging. In this work we reconsider the problem of the $(1+(\lambda,\lambda))$ Genetic Algorithm optimizing OneMax. We decouple its four main parameters and investigate how well state-of-the-art deep reinforcement learning techniques can approximate good control policies. We show that although making deep reinforcement learning learn effectively is a challenging task, once it works, it is very powerful and is able to find policies that outperform all previously known control policies on the same benchmark. Based on the results found through reinforcement learning, we derive a simple control policy that consistently outperforms the default theory-recommended setting by $27\%$ and the irace-tuned policy, the strongest existing control policy on this benchmark, by $13\%$, for all tested problem sizes up to $40{,}000$.
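For readers unfamiliar with the benchmark, a minimal sketch of the $(1+(\lambda,\lambda))$ GA on OneMax with a static $\lambda$ (mutation rate $p=\lambda/n$ and crossover bias $c=1/\lambda$, the theory-recommended coupling) might look as follows; the function names are illustrative and not taken from the paper's code:

```python
import random

def onemax(x):
    """Fitness: number of one-bits."""
    return sum(x)

def one_plus_ll_ga(n, lam, max_evals=10**6, seed=0):
    """Minimal (1+(lambda,lambda)) GA on OneMax with a static lambda.
    Returns the final fitness and the number of fitness evaluations used."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = onemax(x)
    evals = 1
    p, c = lam / n, 1.0 / lam  # theory defaults: p = lambda/n, c = 1/lambda
    while fx < n and evals < max_evals:
        # Mutation phase: draw ell ~ Bin(n, p) once, then flip ell uniformly
        # chosen bits in each of the lam mutants; keep the best mutant.
        ell = sum(rng.random() < p for _ in range(n))
        best_m, best_mf = None, -1
        for _ in range(lam):
            y = x[:]
            for i in rng.sample(range(n), ell):
                y[i] ^= 1
            fy = onemax(y)
            evals += 1
            if fy > best_mf:
                best_m, best_mf = y, fy
        # Crossover phase: biased uniform crossover of the parent with the
        # best mutant (take the mutant's bit with probability c).
        best_c, best_cf = best_m, best_mf
        for _ in range(lam):
            z = [best_m[i] if rng.random() < c else x[i] for i in range(n)]
            fz = onemax(z)
            evals += 1
            if fz > best_cf:
                best_c, best_cf = z, fz
        # Elitist selection: accept the crossover winner if not worse.
        if best_cf >= fx:
            x, fx = best_c, best_cf
    return fx, evals
```

The paper's contribution is precisely to let a learned policy replace the static choices of $\lambda$ (and the coupled rates) on a per-iteration basis.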
Problem

Research questions and friction points this paper is trying to address.

Dynamic parameter control in evolutionary algorithms, beyond single-parameter policies
Jointly coordinating multiple interacting parameters of the $(1+(\lambda,\lambda))$ genetic algorithm
Making deep reinforcement learning reliably learn good parameter-control policies
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end dynamic control of all four GA parameters via deep reinforcement learning (PPO)
Decoupling the algorithm's four main parameters and distilling the learned behavior into a simple analytical policy
Outperforming the theory-recommended default by 27% and the irace-tuned policy by 13%
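For context, the strongest hand-crafted baselines on this benchmark self-adjust $\lambda$ with a one-fifth success rule: shrink $\lambda$ after an improving iteration, grow it otherwise. A minimal sketch of that update with the commonly used update strength $F = 1.5$ (the exact constants in the paper's baselines may differ) could be:

```python
def one_fifth_update(lam, improved, n, F=1.5):
    """One-fifth success rule for self-adjusting lambda in the
    (1+(lambda,lambda)) GA: divide lambda by F after an improving
    iteration, multiply by F**(1/4) otherwise, clamped to [1, n].
    In steady state this keeps roughly one iteration in five improving."""
    if improved:
        return max(1.0, lam / F)
    return min(float(n), lam * F ** 0.25)
```

The learned RL policies and the distilled analytical policy reported in the paper replace this single-parameter rule with a joint choice of all four parameters.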