🤖 AI Summary
To address a key limitation of deep reinforcement learning (DRL)-based navigation—agents trained solely on forward actions frequently become trapped in narrow spaces—this paper proposes a synergistic framework integrating mirror-augmented experience replay with curriculum learning. Unlike prior approaches, it enables end-to-end bidirectional motion policy learning without relying on failure trajectories or redesigning the reward function. The core innovation is a state-action mirroring mechanism that automatically generates high-quality backward navigation experiences during replay. Curriculum learning then progressively strengthens policy robustness by increasing environmental complexity. Evaluated both in ROS-based simulation and on real-world robotic platforms, the method achieves a 42% improvement in backward maneuver success rate and a 31% increase in overall task completion rate over state-of-the-art methods, while preserving forward navigation performance. This bridges the gap between traditional planning and learning-based approaches in terms of action-space utilization and environmental adaptability.
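The summary does not spell out how the state-action mirroring works, but one plausible reading is a 180° rotation of the robot's body frame: the laser scan is rolled by half a revolution, the goal bearing is shifted by π, and the commanded linear velocity changes sign (a left turn remains a left turn about the vertical axis). The sketch below illustrates this idea under assumed conventions—a 360-reading scan with index 0 at the robot's front and an action of `(linear_v, angular_v)`—not the paper's actual implementation:

```python
import math

def wrap_angle(a):
    """Wrap an angle into [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def mirror_transition(scan, goal_heading, action, reward):
    """Hypothetical mirroring: rotate the robot frame by 180 degrees so a
    forward-motion experience reads as a backward-motion one.

    scan         : list of range readings, index 0 = robot's front,
                   evenly spaced over 360 degrees (assumed layout)
    goal_heading : bearing to the goal in radians, 0 = straight ahead
    action       : (linear_v, angular_v) velocity command
    reward       : scalar reward, reused unchanged in this sketch
    """
    half = len(scan) // 2
    mirrored_scan = scan[half:] + scan[:half]      # swap front and back halves
    mirrored_heading = wrap_angle(goal_heading + math.pi)
    linear_v, angular_v = action
    mirrored_action = (-linear_v, angular_v)       # drive in reverse;
    return mirrored_scan, mirrored_heading, mirrored_action, reward
```

Applying the function twice recovers the original transition, which is one sanity check such a mirroring operator should satisfy before the synthetic experiences are mixed into the replay buffer.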
📝 Abstract
Deep Reinforcement Learning (DRL)-based navigation methods have demonstrated promising results for mobile robots, but suffer from limited action flexibility in confined spaces. Conventional DRL approaches predominantly learn forward-motion policies, causing robots to become trapped in complex environments where backward maneuvers are necessary for recovery. This paper presents MAER-Nav (Mirror-Augmented Experience Replay for Robot Navigation), a novel framework that enables bidirectional motion learning without requiring explicit failure-driven hindsight experience replay or reward function modifications. Our approach integrates a mirror-augmented experience replay mechanism with curriculum learning to generate synthetic backward navigation experiences from successful trajectories. Experimental results in both simulation and real-world environments demonstrate that MAER-Nav significantly outperforms state-of-the-art methods while maintaining strong forward navigation capabilities. The framework effectively bridges the gap between the comprehensive action-space utilization of traditional planning methods and the environmental adaptability of learning-based approaches, enabling robust navigation in scenarios where conventional DRL methods consistently fail.
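The abstract states only that curriculum learning increases environmental complexity progressively; a common way to realize this is a success-rate-gated scheduler that promotes training to the next environment tier once recent performance clears a threshold. The sketch below is one such scheduler under assumed names and thresholds (the level list, window size, and 0.8 promotion threshold are all illustrative, not from the paper):

```python
from collections import deque

class CurriculumScheduler:
    """Hypothetical curriculum: advance to the next environment difficulty
    level once the success rate over a sliding window clears a threshold."""

    def __init__(self, levels, window=100, threshold=0.8):
        self.levels = levels            # e.g. ["open", "corridor", "dead_end"]
        self.level_idx = 0
        self.window = deque(maxlen=window)
        self.threshold = threshold

    @property
    def level(self):
        return self.levels[self.level_idx]

    def record(self, success):
        """Log one episode outcome; promote when the window is full and
        the measured success rate meets the threshold."""
        self.window.append(bool(success))
        full = len(self.window) == self.window.maxlen
        rate = sum(self.window) / len(self.window)
        if full and rate >= self.threshold and self.level_idx + 1 < len(self.levels):
            self.level_idx += 1
            self.window.clear()         # re-measure from scratch at the new level
```

Clearing the window on promotion avoids carrying over success statistics from an easier environment, so each tier must be passed on its own evidence.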