Search Inspired Exploration in Reinforcement Learning

📅 2026-01-31
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of inefficient exploration in reinforcement learning under sparse reward settings by proposing a frontier-based subgoal-guided exploration method. The approach identifies reachable and novel regions at the boundary of known states and dynamically selects the most informative subgoals according to learning progress. It further optimizes exploration trajectories using cost-to-come and cost-to-go heuristics, eliminating the need for handcrafted exploration rules and mitigating convergence to suboptimal policies. Empirical results demonstrate that the proposed framework significantly outperforms state-of-the-art baselines across multiple sparse-reward tasks, accelerating task completion while exhibiting strong generalization to arbitrary goal states.

๐Ÿ“ Abstract
Exploration in environments with sparse rewards remains a fundamental challenge in reinforcement learning (RL). Existing approaches such as curriculum learning and Go-Explore often rely on hand-crafted heuristics, while curiosity-driven methods risk converging to suboptimal policies. We propose Search-Inspired Exploration in Reinforcement Learning (SIERL), a novel method that actively guides exploration by setting sub-goals based on the agent's learning progress. At the beginning of each episode, SIERL chooses a sub-goal from the *frontier* (the boundary of the agent's known state space), before the agent continues exploring toward the main task objective. The key contribution of our method is the sub-goal selection mechanism, which provides state-action pairs that are neither overly familiar nor completely novel. This ensures that the frontier expands systematically and that the agent remains capable of reaching any state within it. Inspired by search, sub-goals are prioritized from the frontier based on estimates of cost-to-come and cost-to-go, effectively steering exploration toward the most informative regions. In experiments on challenging sparse-reward environments, SIERL outperforms dominant baselines both in achieving the main task goal and in generalizing to reach arbitrary states in the environment.
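The search-inspired prioritization described above resembles A*-style scoring, where each candidate frontier state is ranked by the sum of an estimated cost-to-come and cost-to-go. The sketch below is a minimal illustration of that idea; the function name, data structures, and scoring rule are assumptions for exposition, not the paper's actual implementation.

```python
# Hypothetical sketch of frontier sub-goal prioritization (not the
# paper's implementation): rank each frontier state by an A*-like score
# f(s) = cost_to_come(s) + cost_to_go(s), and pick the cheapest one.

def select_subgoal(frontier, cost_to_come, cost_to_go):
    """Return the frontier state with the lowest combined cost estimate.

    frontier     -- iterable of candidate sub-goal states
    cost_to_come -- dict: state -> estimated cost from the start state
    cost_to_go   -- dict: state -> estimated cost to the task goal
    """
    return min(frontier, key=lambda s: cost_to_come[s] + cost_to_go[s])

# Toy example with made-up cost estimates for three frontier states.
frontier = ["s1", "s2", "s3"]
g = {"s1": 4.0, "s2": 2.0, "s3": 5.0}  # cost-to-come estimates
h = {"s1": 3.0, "s2": 6.0, "s3": 1.0}  # cost-to-go estimates
print(select_subgoal(frontier, g, h))  # "s3": lowest f = 5.0 + 1.0 = 6.0
```

In the paper's setting these estimates would presumably come from learned value or reachability models rather than fixed tables; the sketch only shows the selection rule itself.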
Problem

Research questions and friction points this paper is trying to address.

sparse rewards
reinforcement learning
exploration
sub-optimal policies
curriculum learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

search-inspired exploration
sub-goal selection
sparse-reward reinforcement learning
frontier-based exploration
cost-to-come and cost-to-go
🔎 Similar Papers
No similar papers found.