NavRL++: A System-Level Framework for Improving Sim-to-Real Transfer in Reinforcement Learning-Based Robot Navigation

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation commonly observed when reinforcement learning navigation policies are transferred from simulation to real robots. The degradation stems from domain discrepancies, and prior work offers little systematic analysis linking training strategies to deployment outcomes. The authors propose an end-to-end training and deployment pipeline together with an empirical study that disentangles the key factors behind the gap: sensor noise, perception failures, system latency, and control response. Building on that study, they introduce perturbation-aware fine-tuning, a post-training adaptation step, and a Transformer-based temporal reasoning policy intended to improve zero-shot transfer robustness and control smoothness. The method outperforms learning-based baselines in both static and dynamic environments, performs comparably to optimization-based planners in static scenes, and is deployed zero-shot on multiple real-world robotic platforms, including aerial and legged robots.

📝 Abstract

Recent years have witnessed significant progress in autonomous navigation using reinforcement learning. However, existing approaches largely emphasize reinforcement learning framework design, such as input representations, action spaces, and reward functions, while providing limited analysis of sim-to-real transfer and insufficient insight into how training strategies affect real-world deployment performance. To bridge this gap, we not only introduce an effective RL framework but also present a complete training and deployment pipeline, along with a systematic empirical study that disentangles the key factors affecting sim-to-real transfer in reinforcement learning-based navigation, including sensor noise, perception failures, system latency, and control response. Building on insights from this analysis, we introduce perturbation-aware fine-tuning, a post-training adaptation strategy that improves transfer robustness by explicitly accounting for empirically identified domain discrepancies. To further mitigate perception degradation and enhance control smoothness in real-world deployment, we propose a Transformer-based temporal reasoning policy that leverages a short-horizon observation history for navigation control. We quantitatively evaluate how individual sim-to-real perturbations and training design choices impact navigation performance across environments. Experimental results demonstrate that the proposed training strategy and policy architecture outperform learning-based baselines in both static and dynamic environments, while achieving performance comparable to optimization-based planners in static settings. We validate our approach through real-world deployment on multiple robotic platforms, including aerial and legged robots, across navigation-centric tasks such as exploration and inspection, demonstrating zero-shot sim-to-real transfer.
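
The four perturbation sources named in the abstract (sensor noise, perception failures, system latency, and control response) could be injected into a simulator loop roughly as in the sketch below. This is an illustration only, not the authors' implementation: the class names `PerturbationConfig` and `PerturbationInjector`, the Gaussian noise model, the "lost reading reports max range" convention, the fixed-step delay, the first-order command lag, and every numeric default are assumptions introduced here.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class PerturbationConfig:
    # All defaults are illustrative assumptions, not values from the paper.
    noise_std: float = 0.05     # std of additive Gaussian range noise (m)
    dropout_prob: float = 0.1   # probability that a range reading is lost
    max_range: float = 10.0     # value reported for a lost reading (m)
    latency_steps: int = 2      # observation delay, in control steps
    control_alpha: float = 0.5  # first-order lag factor for commands, in (0, 1]


class PerturbationInjector:
    """Hypothetical helper that perturbs observations and commands in simulation."""

    def __init__(self, cfg, seed=0):
        self.cfg = cfg
        self.rng = random.Random(seed)
        self.obs_queue = deque()   # holds the most recent latency_steps + 1 observations
        self.applied_cmd = None    # command the simulated actuator actually executes

    def perturb_observation(self, ranges):
        """Apply noise and dropout, then return the delayed observation."""
        noisy = []
        for r in ranges:
            if self.rng.random() < self.cfg.dropout_prob:
                # Perception failure: the reading is lost and reports max range.
                noisy.append(self.cfg.max_range)
            else:
                v = r + self.rng.gauss(0.0, self.cfg.noise_std)
                noisy.append(min(max(v, 0.0), self.cfg.max_range))
        self.obs_queue.append(noisy)
        if len(self.obs_queue) > self.cfg.latency_steps + 1:
            self.obs_queue.popleft()
        # Oldest buffered observation: delayed by latency_steps once the buffer is full.
        return self.obs_queue[0]

    def perturb_command(self, cmd):
        """Model a sluggish control response as a first-order low-pass filter."""
        if self.applied_cmd is None:
            self.applied_cmd = [0.0] * len(cmd)
        a = self.cfg.control_alpha
        self.applied_cmd = [(1.0 - a) * prev + a * c
                            for prev, c in zip(self.applied_cmd, cmd)]
        return self.applied_cmd
```

In a fine-tuning loop, such an injector would sit between the simulator and the policy: the policy sees `perturb_observation(...)` instead of the clean observation, and the simulator executes `perturb_command(...)` instead of the raw policy output. Enabling one field at a time is one way to measure the effect of each perturbation in isolation.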

Problem

Research questions and friction points this paper is trying to address.

sim-to-real transfer

reinforcement learning

robot navigation

domain discrepancy

real-world deployment

Innovation

Methods, ideas, or system contributions that make the work stand out.

sim-to-real transfer

perturbation-aware fine-tuning

Transformer-based policy