🤖 AI Summary
Deep reinforcement learning (DRL) faces significant challenges in bridging the sim-to-real gap and in accurately modeling real-world system latency. Method: This work introduces a low-cost, highly reproducible physical inverted pendulum platform alongside a high-fidelity simulation environment. It systematically decouples and quantifies latency across the perception, communication, inference, and actuation stages. The physical platform is built from Arduino/Raspberry Pi embedded control and 3D-printed and aluminum-extrusion mechanical components, and the simulation uses MuJoCo/PyBullet engines behind an OpenAI Gym-compatible API. Contributions/Results: (1) A configurable end-to-end latency modeling framework achieving <8% error between simulation and physical deployment; (2) Stable on-device deployment of a DRL swing-up policy on hardware with a $120 BOM cost; (3) Substantially lower barriers to physical DRL experimentation, establishing a standardized, education-grade validation platform for latency-sensitive control research.
📝 Abstract
Deep reinforcement learning (DRL) has had success in virtual and simulated domains, but due to key differences between simulated and real-world environments, DRL-trained policies have had limited success in real-world applications. To help researchers bridge the sim-to-real gap, in this paper we describe a low-cost physical inverted pendulum apparatus and software environment for exploring sim-to-real DRL methods. In particular, the design of our apparatus enables detailed examination of the delays that arise in physical systems when sensing, communicating, learning, inferring, and actuating. Moreover, we wish to improve access to educational systems, so our apparatus uses readily available materials and parts to reduce cost and logistical barriers. Our design shows how commercial off-the-shelf electronics, electromechanical components, and sensor systems, combined with common metal extrusions, dowels, and 3D-printed couplings, provide a pathway to an affordable physical DRL apparatus. The physical apparatus is complemented by a simulated environment implemented using a high-fidelity physics engine and an OpenAI Gym interface.
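To make the idea of studying delays in a Gym-style environment concrete, the following is a minimal sketch, not the authors' implementation. It assumes delays can be approximated as a whole number of control steps and that the environment follows the classic four-value Gym `step` signature. The names `ToyPendulum`, `DelayedEnv`, `obs_delay_steps`, `action_delay_steps`, and `neutral_action` are hypothetical, and the pendulum dynamics are a toy stand-in for a real physics engine.

```python
from collections import deque
import math


class ToyPendulum:
    """Hypothetical stand-in for a Gym-style pendulum (not the paper's simulator)."""

    def __init__(self, dt=0.02):
        self.dt = dt
        self.theta = math.pi  # angle from upright; pi means hanging down
        self.omega = 0.0

    def reset(self):
        self.theta, self.omega = math.pi, 0.0
        return (self.theta, self.omega)

    def step(self, torque):
        # Explicit-Euler update; `torque` is treated as an angular acceleration.
        g, length = 9.81, 0.5
        alpha = (g / length) * math.sin(self.theta) + torque
        self.omega += alpha * self.dt
        self.theta += self.omega * self.dt
        reward = math.cos(self.theta)  # largest when upright
        return (self.theta, self.omega), reward, False, {}


class DelayedEnv:
    """Wraps an environment with fixed sensing and actuation delays (in steps)."""

    def __init__(self, env, obs_delay_steps=0, action_delay_steps=0,
                 neutral_action=0.0):
        self.env = env
        self.obs_delay_steps = obs_delay_steps
        self.action_delay_steps = action_delay_steps
        self.neutral_action = neutral_action
        self._actions = deque()
        self._observations = deque()

    def reset(self):
        obs = self.env.reset()
        # Pre-fill the queues: early steps apply the neutral action and
        # return the initial observation until real data has propagated.
        self._actions = deque([self.neutral_action] * self.action_delay_steps)
        self._observations = deque([obs] * self.obs_delay_steps)
        return obs

    def step(self, action):
        # The action issued now reaches the plant `action_delay_steps` later.
        self._actions.append(action)
        applied = self._actions.popleft()
        obs, reward, done, info = self.env.step(applied)
        # The agent sees an observation that is `obs_delay_steps` old.
        self._observations.append(obs)
        delayed_obs = self._observations.popleft()
        info = dict(info, applied_action=applied)
        return delayed_obs, reward, done, info


if __name__ == "__main__":
    env = DelayedEnv(ToyPendulum(), obs_delay_steps=1, action_delay_steps=2)
    env.reset()
    for command in (1.0, 2.0, 3.0, 4.0):
        _, _, _, info = env.step(command)
        print(command, "->", info["applied_action"])
```

Separate queues for observations and actions keep the sensing and actuation delays independently configurable, which mirrors the per-stage view of latency described above. A real system would likely need fractional or variable delays, which this sketch does not attempt to model.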