🤖 AI Summary
This work addresses autonomous exploration in obstacle-dense environments. We propose a platform-agnostic reinforcement learning framework that jointly integrates a graph attention network (GAT) for waypoint selection, a potential-field-inspired reward function for behavior guidance, and a safety-aware action filtering mechanism for real-time motion correction—thereby balancing exploration efficiency and collision avoidance. Our key contributions are: (i) the first application of GATs to exploration decision-making; (ii) synergistic integration of physics-informed potential-field rewards with a lightweight safety filter, enhancing policy robustness while reducing corrective interventions; and (iii) empirical validation across simulation and real-world robotic platforms. Results demonstrate a 72% average reduction in collision rate, a 38% improvement in unknown-area coverage speed, and seamless cross-platform deployment capability.
📝 Abstract
Autonomous exploration of obstacle-rich spaces requires strategies that ensure efficiency while guaranteeing safety against collisions with obstacles. This paper investigates a novel platform-agnostic reinforcement learning framework that integrates a graph neural network-based policy for next-waypoint selection, with a safety filter ensuring safe mobility. Specifically, the neural network is trained using reinforcement learning through the Proximal Policy Optimization (PPO) algorithm to maximize exploration efficiency while minimizing safety filter interventions. Henceforth, when the policy proposes an infeasible action, the safety filter overrides it with the closest feasible alternative, ensuring consistent system behavior. In addition, this paper introduces a reward function shaped by a potential field that accounts for both the agent's proximity to unexplored regions and the expected information gain from reaching them. The proposed framework combines the adaptability of reinforcement learning-based exploration policies with the reliability provided by explicit safety mechanisms. This feature plays a key role in enabling the deployment of learning-based policies on robotic platforms operating in real-world environments. Extensive evaluations in both simulations and experiments performed in a lab environment demonstrate that the approach achieves efficient and safe exploration in cluttered spaces.