🤖 AI Summary
This work addresses two challenges in real-world social navigation: insufficient simulation coverage caused by regional variations in pedestrian behavior, and the limited computational resources and low learning efficiency of edge devices. To tackle these issues, the authors propose a lightweight online learning framework that integrates incremental learning with residual reinforcement learning. The method eliminates the need for a replay buffer and achieves efficient adaptation by optimizing only a residual policy on top of a fixed base policy. In simulation, the approach matches the performance of conventional replay-buffer-based methods and outperforms existing incremental learning strategies. Real-world experiments further demonstrate that the robot adapts rapidly to unseen environments, significantly improving generalization and learning efficiency under edge deployment constraints.
📄 Abstract
As the demand for mobile robots continues to increase, social navigation has emerged as a critical task, driving active research into deep reinforcement learning (RL) approaches. However, because pedestrian dynamics and social conventions vary widely across regions, simulations cannot easily encompass all possible real-world scenarios. Real-world RL, in which agents learn while operating directly in physical environments, presents a promising solution to this issue. Nevertheless, this approach faces significant challenges, particularly the constrained computational resources of edge devices and low learning efficiency. In this study, we propose incremental residual RL (IRRL). This method integrates incremental learning, a lightweight process that operates without a replay buffer or batch updates, with residual RL, which improves learning efficiency by training only on the residuals relative to a base policy. In simulation experiments, we demonstrated that, despite lacking a replay buffer, IRRL achieved performance comparable to that of conventional replay-buffer-based methods and outperformed existing incremental learning approaches. Furthermore, real-world experiments confirmed that IRRL enables robots to adapt effectively to previously unseen environments through real-world learning.
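To make the two ingredients concrete, the abstract's combination of residual RL (the executed action is the base policy's action plus a learned residual) with incremental learning (each transition is consumed once, with no replay buffer or batch updates) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy regulation task, the hand-coded proportional base policy, the linear-Gaussian residual, and the incremental TD(0) actor-critic update rule are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM = 2, 1
ALPHA_V, ALPHA_PI, GAMMA, SIGMA = 0.05, 0.01, 0.95, 0.1

def base_policy(s):
    # Hypothetical fixed base policy: a crude proportional controller
    # (stands in for a policy pretrained in simulation).
    return np.array([-0.5 * s[0]])

def step_env(s, a):
    # Toy environment: drive s[0] to zero; s[1] is a constant bias feature.
    s_next = np.array([s[0] + a[0], s[1]])
    return s_next, -abs(s_next[0])  # reward penalizes distance from the goal

# Linear residual policy and linear value function, both updated one
# transition at a time: no replay buffer, no batch updates.
W_pi = np.zeros((ACTION_DIM, STATE_DIM))  # residual-policy weights
w_v = np.zeros(STATE_DIM)                 # value-function weights

for episode in range(100):
    s = np.array([1.0, 1.0])
    for t in range(20):
        noise = SIGMA * rng.standard_normal(ACTION_DIM)
        a = base_policy(s) + W_pi @ s + noise  # executed action = base + residual
        s_next, r = step_env(s, a)

        # Incremental TD(0) actor-critic update from this single transition.
        td_error = r + GAMMA * (w_v @ s_next) - (w_v @ s)
        w_v += ALPHA_V * td_error * s
        # Policy-gradient step for a Gaussian residual (score = noise / sigma^2).
        W_pi += ALPHA_PI * td_error * np.outer(noise / SIGMA**2, s)

        s = s_next
```

Because only the residual is learned, the base policy keeps the robot's behavior reasonable from the first step, while the per-transition updates keep memory and compute costs low enough for edge deployment.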