Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the low sample efficiency of reinforcement learning and the compounding errors of multi-step model prediction for nonlinear robotic systems. It learns a linear lifted dynamics model via Koopman operator theory and embeds that model in an actor-critic architecture. Policy gradients are estimated from one-step predictions of the learned model rather than from multi-step rollouts, which yields an online mini-batch policy gradient method that lowers computational cost and limits rollout error. Across several simulated benchmarks and two hardware platforms, the Kinova Gen3 manipulator and the Unitree Go1 quadruped, the method is reported to be more sample-efficient than model-free RL baselines, to outperform model-based RL baselines in control performance, and to perform comparably to classical model-based controllers that rely on exact system dynamics.
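To make the idea of a linear lifted model concrete, here is a minimal sketch of fitting z_next ≈ A z + B u by least squares on lifted snapshots. Everything specific in it is an assumption for illustration: the toy system `step`, the hand-picked observables in `lift`, and the helper `fit_lifted_linear` are hypothetical and are not taken from the paper, which may well learn its lifting functions rather than fix them by hand.

```python
import numpy as np

def lift(x):
    """Hypothetical lifting: the state plus one quadratic observable."""
    x = np.atleast_2d(x)
    return np.column_stack([x[:, 0], x[:, 1], x[:, 0] ** 2])

def step(x, u, a=0.9, b=0.5, c=0.3):
    """Toy nonlinear system (an assumption, not from the paper).

    x1' = a*x1,  x2' = b*x2 + c*x1**2 + u
    It is exactly linear in the observables returned by `lift`.
    """
    x = np.atleast_2d(x)
    u = np.asarray(u, dtype=float).reshape(-1)
    return np.column_stack([a * x[:, 0], b * x[:, 1] + c * x[:, 0] ** 2 + u])

def fit_lifted_linear(X, U, X_next):
    """Least-squares fit of z_next ≈ A z + B u on lifted snapshot pairs."""
    Z, Z_next = lift(X), lift(X_next)
    G = np.hstack([Z, U.reshape(len(Z), -1)])        # regressors [z, u]
    W, *_ = np.linalg.lstsq(G, Z_next, rcond=None)   # solves G @ W ≈ Z_next
    n = Z.shape[1]
    return W[:n].T, W[n:].T                          # A is (n, n), B is (n, m)

# Collect random one-step transitions and fit the lifted model.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
U = rng.uniform(-1.0, 1.0, size=(500, 1))
X_next = step(X, U)
A, B = fit_lifted_linear(X, U, X_next)

# One-step prediction error in the lifted space.
err = np.abs(lift(X) @ A.T + U @ B.T - lift(X_next)).max()
print("max one-step lifted prediction error:", err)
```

Because this toy system is exactly linear in the chosen observables, the fit recovers A and B to numerical precision. For a real robot the lifted model would only approximate the dynamics, which is one reason to rely on short prediction horizons.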

📝 Abstract

This paper presents a model-based reinforcement learning (RL) framework for optimal closed-loop control of nonlinear robotic systems. The proposed approach learns linear lifted dynamics through Koopman operator theory and integrates the resulting model into an actor-critic architecture for policy optimization, where the policy represents a parameterized closed-loop controller. To reduce computational cost and mitigate model rollout errors, policy gradients are estimated using one-step predictions of the learned dynamics rather than multi-step propagation. This leads to an online mini-batch policy gradient framework that enables policy improvement from streamed interaction data. The proposed framework is evaluated on several simulated nonlinear control benchmarks and two real-world hardware platforms, including a Kinova Gen3 robotic arm and a Unitree Go1 quadruped. Experimental results demonstrate improved sample efficiency over model-free RL baselines, superior control performance relative to model-based RL baselines, and control performance comparable to classical model-based methods that rely on exact system dynamics.
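The one-step gradient estimate described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: it assumes a linear policy u = -K z on the lifted state, a quadratic stage cost with weights Q and R, and a fixed quadratic critic V(z) = zᵀ P z, none of which the abstract specifies. The matrices `A` and `B`, the function names, and the random mini-batches standing in for streamed interaction data are all hypothetical. Only the actor update is shown; the critic is held fixed.

```python
import numpy as np

def one_step_objective(K, Z, A, B, Q, R, P, gamma):
    """Mini-batch mean of stage cost plus discounted critic value one step ahead."""
    U = -Z @ K.T                       # assumed linear policy on the lifted state
    Z_next = Z @ A.T + U @ B.T         # single prediction with the lifted model
    stage = np.einsum("ni,ij,nj->n", Z, Q, Z) + np.einsum("ni,ij,nj->n", U, R, U)
    value = np.einsum("ni,ij,nj->n", Z_next, P, Z_next)
    return np.mean(stage + gamma * value)

def one_step_policy_gradient(K, Z, A, B, Q, R, P, gamma):
    """Analytic gradient of one_step_objective w.r.t. K (R and P symmetric)."""
    U = -Z @ K.T
    Z_next = Z @ A.T + U @ B.T
    dU = 2.0 * U @ R + 2.0 * gamma * Z_next @ P @ B   # d(objective)/du per sample
    return -(dU.T @ Z) / len(Z)                       # chain rule through u = -K z

# Hypothetical lifted model and weights, chosen only for illustration.
A = np.array([[0.9, 0.0, 0.0], [0.0, 0.5, 0.3], [0.0, 0.0, 0.81]])
B = np.array([[0.0], [1.0], [0.0]])
Q, R, P, gamma = np.eye(3), 0.1 * np.eye(1), np.eye(3), 0.95

rng = np.random.default_rng(0)
K = np.zeros((1, 3))
for _ in range(200):
    Z = rng.normal(size=(64, 3))       # stand-in for a streamed mini-batch
    K -= 0.05 * one_step_policy_gradient(K, Z, A, B, Q, R, P, gamma)

print("policy gain K after online updates:", K)
```

Each update needs only one model prediction per sample, so model error does not accumulate over a rollout; the critic term carries the information about longer-horizon cost instead.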

Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning

Nonlinear Robotic Systems

Sample Efficiency

Closed-loop Control

Model-based RL

Innovation

Methods, ideas, or system contributions that make the work stand out.

Koopman Operator

Model-based Reinforcement Learning

Linear Lifted Dynamics