Multi-parameter Control for the $(1+(\lambda,\lambda))$-GA on OneMax via Deep Reinforcement Learning

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
The $(1+(\lambda,\lambda))$ genetic algorithm faces challenges in dynamically coordinating multiple parameters—population size $\lambda$, mutation rate, crossover rate, and sampling count—during OneMax optimization. Method: This paper proposes an end-to-end, four-dimensional parameter co-adaptation framework based on deep reinforcement learning (specifically, the Proximal Policy Optimization algorithm). It introduces policy distillation and runtime parameter decoupling to yield an interpretable, lightweight analytical control policy, coupled with fitness-landscape-aware reward shaping. Contribution/Results: To the authors' knowledge, this is the first approach enabling real-time joint optimization of all four core parameters. For all tested problem sizes up to 40,000, the derived policy outperforms the theory-recommended default setting by 27% and irace-based automated parameter tuning, the strongest existing control policy on this benchmark, by 13%—demonstrating both theoretical competitiveness and practical deployability.

📝 Abstract
It is well known that evolutionary algorithms can benefit from dynamic choices of the key parameters that control their behavior, to adjust their search strategy to the different stages of the optimization process. A prominent example where dynamic parameter choices have shown a provable super-constant speed-up is the $(1+(\lambda,\lambda))$ Genetic Algorithm optimizing the OneMax function. While optimal parameter control policies result in linear expected running times, this is not possible with static parameter choices. This result has spurred a lot of interest in parameter control policies. However, many works, in particular theoretical running time analyses, focus on controlling one single parameter. Deriving policies for controlling multiple parameters remains very challenging. In this work we reconsider the problem of the $(1+(\lambda,\lambda))$ Genetic Algorithm optimizing OneMax. We decouple its four main parameters and investigate how well state-of-the-art deep reinforcement learning techniques can approximate good control policies. We show that although making deep reinforcement learning learn effectively is a challenging task, once it works, it is very powerful and is able to find policies that outperform all previously known control policies on the same benchmark. Based on the results found through reinforcement learning, we derive a simple control policy that consistently outperforms the default theory-recommended setting by $27\%$ and the irace-tuned policy, the strongest existing control policy on this benchmark, by $13\%$, for all tested problem sizes up to $40{,}000$.
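For readers unfamiliar with the benchmark, a minimal sketch of the $(1+(\lambda,\lambda))$ GA on OneMax with a static $\lambda$ (mutation rate $p=\lambda/n$ and crossover bias $c=1/\lambda$, the theory-recommended coupling) might look as follows; the function names are illustrative and not taken from the paper's code:

```python
import random

def onemax(x):
    """Fitness: number of one-bits."""
    return sum(x)

def one_plus_ll_ga(n, lam, max_evals=10**6, seed=0):
    """Minimal (1+(lambda,lambda)) GA on OneMax with a static lambda.
    Returns the final fitness and the number of fitness evaluations used."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = onemax(x)
    evals = 1
    p, c = lam / n, 1.0 / lam  # theory defaults: p = lambda/n, c = 1/lambda
    while fx < n and evals < max_evals:
        # Mutation phase: draw ell ~ Bin(n, p) once, then flip ell uniformly
        # chosen bits in each of the lam mutants; keep the best mutant.
        ell = sum(rng.random() < p for _ in range(n))
        best_m, best_mf = None, -1
        for _ in range(lam):
            y = x[:]
            for i in rng.sample(range(n), ell):
                y[i] ^= 1
            fy = onemax(y)
            evals += 1
            if fy > best_mf:
                best_m, best_mf = y, fy
        # Crossover phase: biased uniform crossover of the parent with the
        # best mutant (take the mutant's bit with probability c).
        best_c, best_cf = best_m, best_mf
        for _ in range(lam):
            z = [best_m[i] if rng.random() < c else x[i] for i in range(n)]
            fz = onemax(z)
            evals += 1
            if fz > best_cf:
                best_c, best_cf = z, fz
        # Elitist selection: accept the crossover winner if not worse.
        if best_cf >= fx:
            x, fx = best_c, best_cf
    return fx, evals
```

The paper's contribution is precisely to let a learned policy replace the static choices of $\lambda$ (and the coupled rates) on a per-iteration basis.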
Problem

Research questions and friction points this paper is trying to address.

Dynamic parameter control in evolutionary algorithms, beyond single-parameter policies
Jointly coordinating multiple interacting parameters of the $(1+(\lambda,\lambda))$ genetic algorithm
Making deep reinforcement learning reliably learn good parameter-control policies
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end dynamic control of all four GA parameters via deep reinforcement learning (PPO)
Decoupling the algorithm's four main parameters and distilling the learned behavior into a simple analytical policy
Outperforming the theory-recommended default by 27% and the irace-tuned policy by 13%
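For context, the strongest hand-crafted baselines on this benchmark self-adjust $\lambda$ with a one-fifth success rule: shrink $\lambda$ after an improving iteration, grow it otherwise. A minimal sketch of that update with the commonly used update strength $F = 1.5$ (the exact constants in the paper's baselines may differ) could be:

```python
def one_fifth_update(lam, improved, n, F=1.5):
    """One-fifth success rule for self-adjusting lambda in the
    (1+(lambda,lambda)) GA: divide lambda by F after an improving
    iteration, multiply by F**(1/4) otherwise, clamped to [1, n].
    In steady state this keeps roughly one iteration in five improving."""
    if improved:
        return max(1.0, lam / F)
    return min(float(n), lam * F ** 0.25)
```

The learned RL policies and the distilled analytical policy reported in the paper replace this single-parameter rule with a joint choice of all four parameters.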