CRoSS: A Continual Robotic Simulation Suite for Scalable Reinforcement Learning with High Task Diversity and Realistic Physics Simulation

📅 2026-02-04
🤖 AI Summary
This work addresses policy forgetting in continual reinforcement learning and the lack of high-fidelity, diverse robotic simulation benchmarks by introducing an extensible Gazebo-based simulation suite. The platform supports a two-wheeled differential-drive mobile robot and a seven-joint robotic arm, integrates simulated sensors (lidar, camera, and a bumper sensor), and offers arm control in both joint space and Cartesian space. A key feature is a set of kinematics-only variants of the arm benchmarks that bypass physics simulation, which is possible as long as no sensor readings are required, and run two orders of magnitude faster. Distributed as an Apptainer container, the suite runs out of the box, and the authors report results for standard algorithms such as DQN and policy gradient methods, positioning it as a reproducible, task-diverse benchmark for continual reinforcement learning research.
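As a rough illustration of what "differential-drive" means for the mobile robot, the sketch below integrates the textbook unicycle model for a two-wheeled base. It is not taken from CRoSS or Gazebo: the function name `diff_drive_step` and all of its parameters are hypothetical, and a physics simulator would additionally model friction, inertia, and contacts.

```python
import math

def diff_drive_step(x, y, theta, v_left, v_right, wheel_base, dt):
    """Advance a differential-drive pose by one explicit Euler step.

    v_left / v_right are wheel rim speeds in m/s and wheel_base is the
    distance between the two wheels in m (hypothetical parameters).
    """
    v = 0.5 * (v_left + v_right)              # forward speed of the chassis
    omega = (v_right - v_left) / wheel_base   # yaw rate, positive turns left
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta
```

Equal wheel speeds move the robot straight ahead, while opposite wheel speeds rotate it in place, which is why the two wheel velocities form a natural low-level action space for tasks such as line following.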

📝 Abstract
Continual reinforcement learning (CRL) requires agents to learn from a sequence of tasks without forgetting previously acquired policies. In this work, we introduce a novel benchmark suite for CRL based on realistically simulated robots in the Gazebo simulator. Our Continual Robotic Simulation Suite (CRoSS) benchmarks rely on two robotic platforms: a two-wheeled differential-drive robot with lidar, camera and bumper sensor, and a robotic arm with seven joints. The former represents an agent in line-following and object-pushing scenarios, where variation of visual and structural parameters yields a large number of distinct tasks, whereas the latter is used in two goal-reaching scenarios: one with high-level Cartesian hand position control (modeled after the Continual World benchmark) and one with low-level control based on joint angles. For the robotic arm benchmarks, we provide additional kinematics-only variants that bypass the need for physical simulation (as long as no sensor readings are required) and can be run two orders of magnitude faster. CRoSS is designed to be easily extensible, enables controlled studies of continual reinforcement learning in robotic settings with high physical realism, and in particular allows the use of almost arbitrary simulated sensors. To ensure reproducibility and ease of use, we provide a containerized setup (Apptainer) that runs out-of-the-box, and report the performance of standard RL algorithms, including Deep Q-Networks (DQN) and policy gradient methods. This highlights the suite's suitability as a scalable and reproducible benchmark for CRL research.
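The idea behind a kinematics-only variant can be sketched as follows: when no sensor readings are needed, the hand position follows directly from the joint angles via forward kinematics, so a goal-reaching step reduces to a few trigonometric operations. The example below is a minimal sketch under strong assumptions, not the CRoSS implementation: it uses a planar serial chain as a stand-in for the seven-joint arm, and the names `forward_kinematics` and `kinematic_step`, the negative-distance reward, and the tolerance `tol` are all hypothetical.

```python
import math

def forward_kinematics(joint_angles, link_lengths):
    """Return the (x, y) end-effector position of a planar serial chain."""
    x = y = 0.0
    heading = 0.0
    for angle, length in zip(joint_angles, link_lengths):
        heading += angle                    # joint angles accumulate along the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y

def kinematic_step(joint_angles, joint_deltas, link_lengths, goal, tol=0.05):
    """Apply joint-angle increments and score the result without any physics."""
    new_angles = [a + d for a, d in zip(joint_angles, joint_deltas)]
    x, y = forward_kinematics(new_angles, link_lengths)
    distance = math.hypot(goal[0] - x, goal[1] - y)
    reward = -distance                      # assumed reward: closer to the goal is better
    done = distance < tol                   # assumed success criterion
    return new_angles, reward, done
```

Because a step like this involves no collision checking, contact dynamics, or sensor rendering, it illustrates why skipping the physics engine can be much cheaper per step, at the cost of the realism that the full Gazebo variants provide.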
Problem

Research questions and friction points this paper is trying to address.

Continual Reinforcement Learning
Robotic Simulation
Task Diversity
Realistic Physics
Benchmark Suite
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual Reinforcement Learning
Robotic Simulation
High Task Diversity
Realistic Physics
Kinematics-only Variants