🤖 AI Summary
To address the low exploration efficiency and poor scalability of autonomous agents in large-scale environments, this paper proposes an attention-based hierarchical reinforcement learning framework. Methodologically, it (1) constructs an incremental, shape-adaptive hierarchical graph structure to enable multi-scale spatial reasoning; (2) devises a community-aware global graph update algorithm with linear time complexity; and (3) introduces a parameter-free privileged reward mechanism that eliminates reward-shaping bias and guides near-optimal exploration policies. The approach integrates hierarchical graph representation learning, incremental map updating, and multi-scale belief inference. Experiments demonstrate up to a 20% improvement in exploration efficiency over state-of-the-art methods in large-scale simulations. Furthermore, the framework has been successfully deployed in a real-world campus environment measuring 300 m × 230 m, validating its efficiency, scalability, and practical applicability.
📝 Abstract
This work pushes the boundaries of learning-based methods for autonomous robot exploration in terms of environmental scale and exploration efficiency. We present HEADER, an attention-based reinforcement learning approach with hierarchical graphs for efficient exploration in large-scale environments. HEADER follows conventional methods in constructing hierarchical representations of the robot belief/map, but further introduces a novel community-based algorithm to construct and update a global graph that is fully incremental, shape-adaptive, and of linear time complexity. Building upon attention-based networks, our planner reasons finely about the nearby belief within the local range while coarsely leveraging distant information at the global scale, enabling next-best-viewpoint decisions that account for multi-scale spatial dependencies. Beyond the novel map representation, we introduce a parameter-free privileged reward that significantly improves model performance and produces near-optimal exploration behaviors by avoiding the training-objective bias caused by handcrafted reward shaping. In challenging, large-scale simulated exploration scenarios, HEADER demonstrates better scalability than most existing learning and non-learning methods, while achieving a significant improvement in exploration efficiency (up to 20%) over state-of-the-art baselines. We also deploy HEADER on hardware and validate it in complex, large-scale real-world scenarios, including a 300 m × 230 m campus environment.
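To make the community-based hierarchical map idea concrete, the following is a minimal, self-contained sketch of one way an incremental two-level graph could work; it is not the paper's actual algorithm (which is not specified here), and all class and method names (`HierarchicalGraph`, `add_node`) are hypothetical. Each newly added fine-scale node joins the neighboring community with the most links (a simple label-propagation-style heuristic) or starts a new one, and the coarse community-level graph is patched only around the new node, keeping each update local rather than requiring a global rebuild.

```python
from collections import defaultdict

class HierarchicalGraph:
    """Toy two-level map: a fine-scale graph whose nodes are grouped
    into communities, plus a coarse graph over those communities.
    Illustrative sketch only, not HEADER's algorithm."""

    def __init__(self):
        self.adj = defaultdict(set)         # fine-scale adjacency
        self.community = {}                 # node -> community id
        self.global_adj = defaultdict(set)  # community-level adjacency
        self._next_cid = 0

    def add_node(self, v, neighbors=()):
        # Connect the new node at the fine scale.
        for u in neighbors:
            self.adj[v].add(u)
            self.adj[u].add(v)
        # Vote: join the neighboring community with the most links.
        votes = defaultdict(int)
        for u in self.adj[v]:
            if u in self.community:
                votes[self.community[u]] += 1
        if votes:
            cid = max(votes, key=votes.get)
        else:
            cid = self._next_cid  # no labeled neighbors: new community
            self._next_cid += 1
        self.community[v] = cid
        # Patch the coarse graph only around v, in O(deg(v)) time.
        for u in self.adj[v]:
            cu = self.community.get(u)
            if cu is not None and cu != cid:
                self.global_adj[cid].add(cu)
                self.global_adj[cu].add(cid)
```

For example, adding `"b"` with neighbor `"a"` places it in `"a"`'s community, while an isolated `"c"` starts a new one; a later node bridging the two creates a single coarse edge between the communities. The point of the sketch is the cost profile the abstract claims: per-node work proportional to the node's degree, hence linear overall, rather than recomputing communities from scratch.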