EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Handcrafted reward functions in reinforcement learning rely heavily on domain expertise, are hard to audit, and generalize poorly, which limits robot navigation performance. To address these limitations, this work proposes EvoNav, a framework that combines large language models with evolutionary search to automatically evolve navigation reward functions. The approach introduces a three-stage progressive evaluation mechanism (analytical proxy assessment, lightweight interaction, and full policy training) that filters candidate reward functions from low-fidelity to high-fidelity evaluation, so that only the most promising candidates reach the expensive training stage. The authors report that this hierarchical screening improves both search efficiency and the resulting policy performance, and that EvoNav outperforms both handcrafted rewards and state-of-the-art automated reward design methods in dynamic human environments.

📝 Abstract

Robot navigation is a crucial task with applications to social robots in dynamic human environments. While Reinforcement Learning (RL) has shown great promise for this problem, policy quality is highly sensitive to the specification of the reward function. Hand-crafted rewards require substantial domain expertise and embed inductive biases that are difficult to audit or adapt, limiting their effectiveness and leading to suboptimal performance. In this paper, we propose EvoNav, an evolutionary framework that automates the design of robot navigation reward functions via large language models (LLMs). To avoid the prohibitive cost of training a policy for every candidate, EvoNav evaluates each candidate proposal from the LLM via a progressive three-stage warm-up-boost procedure. EvoNav advances from analytical proxies with low-cost surrogates, such as small datasets and analytic rules, to lightweight rollouts and, finally, to full policy training, enabling computationally efficient exploration under effective feedback. Experimental results show that EvoNav produces more effective navigation policies than manually designed RL rewards and state-of-the-art reward design methods.
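The three-stage evaluation described in the abstract can be pictured as a cascade of filters ordered from cheapest to most expensive. The Python sketch below is illustrative only and is not the authors' implementation: `progressive_filter`, `counting_scorer`, the candidate names, the scores, and the per-stage `keep` sizes are all hypothetical, and the three toy scorers merely stand in for analytical proxies, lightweight rollouts, and full policy training. The LLM proposal and evolutionary mutation steps are left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = str  # hypothetical: an identifier for one LLM-proposed reward function
Scorer = Callable[[Candidate], float]


def progressive_filter(
    candidates: Sequence[Candidate],
    stages: Sequence[Tuple[Scorer, int]],
) -> List[Candidate]:
    """Pass candidates through stages ordered from cheapest to most expensive.

    Each stage is (scorer, keep): every survivor is scored once and only the
    top `keep` candidates move on to the next, costlier stage.
    """
    survivors = list(candidates)
    for scorer, keep in stages:
        survivors = sorted(survivors, key=scorer, reverse=True)[:keep]
    return survivors


def counting_scorer(scores: Dict[Candidate, float], log: List[Candidate]) -> Scorer:
    """Toy scorer that looks up a fixed score and records each evaluation."""
    def score(candidate: Candidate) -> float:
        log.append(candidate)
        return scores[candidate]
    return score


def demo() -> Tuple[List[Candidate], Tuple[int, int, int]]:
    candidates = ["a", "b", "c", "d", "e", "f"]
    proxy_log: List[Candidate] = []
    rollout_log: List[Candidate] = []
    train_log: List[Candidate] = []
    stages = [
        # Stage 1 (assumed): cheap analytical proxy, scored for all candidates.
        (counting_scorer({"a": 0.9, "b": 0.8, "c": 0.1,
                          "d": 0.7, "e": 0.2, "f": 0.6}, proxy_log), 4),
        # Stage 2 (assumed): lightweight rollouts, scored for survivors only.
        (counting_scorer({"a": 0.3, "b": 0.9, "d": 0.8, "f": 0.1}, rollout_log), 2),
        # Stage 3 (assumed): full policy training, scored for the final few.
        (counting_scorer({"b": 0.5, "d": 0.7}, train_log), 1),
    ]
    best = progressive_filter(candidates, stages)
    return best, (len(proxy_log), len(rollout_log), len(train_log))


if __name__ == "__main__":
    best, calls = demo()
    print(best, calls)
```

In this toy run the most expensive scorer is called for only two of the six candidates, which is the kind of saving the staged design is meant to provide. It also shows that a candidate ranked first by the cheap proxy need not win at the final stage.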

Problem

Research questions and friction points this paper is trying to address.

robot navigation

reward function design

reinforcement learning

inductive bias

policy performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

evolutionary reward design

large language models

robot navigation