CAPSULE: Control-Theoretic Action Perturbations for Safe Uncertainty-Aware Reinforcement Learning

📅 2026-04-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses safe exploration in high-dimensional systems with unknown dynamics, where existing safe reinforcement learning methods often guarantee safety only in expectation and can therefore still incur safety violations. The authors propose a method that first learns a probabilistic control-affine dynamics model offline, then incorporates the model's uncertainty into the construction of Control Barrier Functions (CBFs), yielding conservative safety constraints. During online execution, a constraint-based mechanism corrects the policy's actions to satisfy these CBF constraints, without requiring prior knowledge of the true system dynamics. Empirical evaluations on nonlinear continuous-control benchmarks show that the approach significantly reduces safety violations while achieving returns comparable to those of existing baselines.
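For orientation, the kind of constraint the summary describes can be written in the standard CBF form. This is a generic textbook-style sketch, not the paper's exact formulation: the symbols $h$, $\alpha$, $\mu_f$, $\mu_g$, $\sigma$, and $\kappa$ are assumptions introduced here, and the uncertainty term is assumed to depend only on the state.

```latex
% Control-affine dynamics and safe set (standard CBF setup)
\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\}

% Nominal CBF condition with a linear class-K function, \alpha > 0
\nabla h(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) + \alpha\, h(x) \ge 0

% Conservative variant using a learned model: the unknown f, g are replaced by
% predictive means \mu_f, \mu_g, and the constraint is tightened by an assumed
% uncertainty margin \kappa\,\sigma(x), with \kappa > 0
\nabla h(x)^{\top} \bigl( \mu_f(x) + \mu_g(x)\,u \bigr) + \alpha\, h(x) - \kappa\,\sigma(x) \ge 0
```

A larger $\kappa$ tightens the constraint, trading task performance for a wider safety margin where the learned model is less certain.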

📝 Abstract

Ensuring safe exploration in high-dimensional systems with unknown dynamics remains a significant challenge. Existing safe reinforcement learning methods often provide safety guarantees only in expectation, which can still lead to safety violations. Control-theoretic approaches, in contrast, offer hard constraint-based safety guarantees but typically assume access to known system dynamics or require accurate estimation of control-affine models. In this paper, we propose a safe reinforcement learning framework that learns a probabilistic control-affine dynamics model in an offline setting. The learned model is leveraged to explicitly construct control barrier functions (CBFs) that incorporate model uncertainty to provide conservative safety constraints. These CBF constraints are enforced through an online constraint-based action correction mechanism, enabling safe exploration without overly restricting task performance. Empirical evaluations on nonlinear, complex continuous-control benchmarks demonstrate that our approach achieves returns comparable to those of existing baselines while significantly reducing safety violations.
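The online action correction described in the abstract can be illustrated with a minimal sketch. It assumes a single CBF constraint that is affine in the action, so the smallest correction has a closed form (a projection onto a half-space). The function name `cbf_correct`, the parameters `alpha` and `kappa`, and the state-only uncertainty margin `sigma` are hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

def cbf_correct(u_rl, grad_h, f_mean, g_mean, h_val, sigma, alpha=1.0, kappa=2.0):
    """Minimally perturb the policy action u_rl so an uncertainty-tightened
    CBF condition holds (illustrative sketch, assumed formulation):

        grad_h @ (f_mean + g_mean @ u) + alpha * h_val - kappa * sigma >= 0

    This is the closed-form solution of
        min ||u - u_rl||^2  subject to  a @ u + b >= 0.
    """
    a = g_mean.T @ grad_h                                  # coefficient on u
    b = grad_h @ f_mean + alpha * h_val - kappa * sigma    # action-independent part
    slack = a @ u_rl + b
    if slack >= 0.0:
        return u_rl.copy()                                 # already safe: no change
    denom = a @ a
    if denom < 1e-12:
        raise ValueError("infeasible: the action cannot influence the barrier")
    return u_rl - (slack / denom) * a                      # project onto the constraint boundary

# Toy example (assumed): 1-D integrator x_dot = u with safe set x <= 1,
# so h(x) = 1 - x and grad_h = [-1]. The state is x = 0.9.
grad_h = np.array([-1.0])
f_mean = np.array([0.0])       # learned drift mean (hypothetical value)
g_mean = np.array([[1.0]])     # learned control-gain mean (hypothetical value)
h_val = 1.0 - 0.9

u_unsafe = np.array([1.0])     # policy pushes toward the boundary
u_safe = cbf_correct(u_unsafe, grad_h, f_mean, g_mean, h_val, sigma=0.02)
print(u_safe)                  # approximately [0.06]

u_benign = np.array([-1.0])    # policy moves away from the boundary
print(cbf_correct(u_benign, grad_h, f_mean, g_mean, h_val, sigma=0.02))  # unchanged
```

With several constraints or action bounds, the closed form no longer applies and a quadratic-program solver would be needed instead.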

Problem

Research questions and friction points this paper is trying to address.

safe reinforcement learning

unknown dynamics

control barrier functions

model uncertainty

high-dimensional systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

control barrier functions

probabilistic dynamics model

safe reinforcement learning