🤖 AI Summary
This work addresses the lack of safety guarantees in model-free reinforcement learning for black-box dynamical systems by proposing a fully data-driven safe control framework. The approach integrates Hamilton-Jacobi reachability analysis with model-free learning, leveraging contraction theory to design a tailored loss function that jointly trains two neural networks: one approximating the (generally non-smooth) safe value function and one approximating its derivative. This enables, for the first time, a model-free quadratic-programming (QP) safety filter whose learned value provably converges to the viscosity solution of the underlying Hamilton-Jacobi equation. The resulting filter effectively prevents unsafe behavior during early training, even in complex scenarios such as hybrid systems, thereby significantly improving learning stability and cumulative reward and outperforming strong existing baselines.
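For concreteness, the generic HJ-based QP filtering step the summary alludes to can be written as follows. This is only a sketch: the constraint below is stated with a known dynamics $f$ and a decay rate $\gamma$ for illustration, whereas the paper builds the filter purely from learned, data-driven quantities:

$$
u^{\star}(x) \;=\; \arg\min_{u \in \mathcal{U}} \; \lVert u - u_{\mathrm{nom}}(x) \rVert^2
\quad \text{s.t.} \quad \nabla V(x)^{\top} f(x, u) \;\ge\; -\gamma\, V(x),
$$

where $V$ is the safe value function approximated by one network, $\nabla V$ is its derivative approximated by the second network, and $u_{\mathrm{nom}}$ is the nominal (task) action; the filter returns the minimal perturbation of $u_{\mathrm{nom}}$ that keeps the safety constraint satisfied.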
📝 Abstract
We introduce Deep QP Safety Filter, a fully data-driven safety layer for black-box dynamical systems. Our method learns a quadratic-program (QP) safety filter without model knowledge by combining Hamilton-Jacobi (HJ) reachability with model-free learning. We construct contraction-based losses for both the safety value function and its derivative, and train two neural networks accordingly. In the exact setting, the learned critic converges to the viscosity solution (and its derivative), even when the value function is non-smooth. Across diverse dynamical systems, including a hybrid system, and multiple RL tasks, Deep QP Safety Filter substantially reduces pre-convergence failures while accelerating learning toward higher returns than strong baselines, offering a principled and practical route to safe, model-free control.
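To make the filtering step concrete, below is a minimal hedged sketch in Python. It assumes the learned HJ safety constraint can be reduced to a single half-space constraint linear in the control, $a(x)^{\top} u + b(x) \ge 0$, with coefficients assembled from the two trained networks; the names `qp_safety_filter`, `a`, `b`, `grad_net`, `value_net`, and `gamma` are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

# Sketch of one QP safety-filter step, assuming the learned HJ constraint
# is linear in the control: a(x)^T u + b(x) >= 0. The coefficient names
# below (grad_net, value_net, gamma) are hypothetical stand-ins for the
# paper's two trained networks and constraint construction.

def qp_safety_filter(u_nom: np.ndarray, a: np.ndarray, b: float) -> np.ndarray:
    """Project the nominal action onto the half-space a^T u + b >= 0.

    For a single linear constraint, the QP
        min_u ||u - u_nom||^2  s.t.  a^T u + b >= 0
    has the closed-form projection below; with control limits or several
    constraints one would call a general QP solver (e.g. OSQP) instead.
    """
    slack = float(a @ u_nom + b)
    if slack >= 0.0:
        # Nominal action already satisfies the safety constraint.
        return u_nom
    # Minimal correction along the constraint normal.
    return u_nom - (slack / float(a @ a)) * a

# Hypothetical usage with constraint coefficients from the learned nets:
u_nom = np.array([0.5, -0.2])   # nominal RL action
a = np.array([1.0, 2.0])        # e.g. derived from grad_net(x)
b = -1.0                        # e.g. involving gamma * value_net(x)
u_safe = qp_safety_filter(u_nom, a, b)
print(u_safe)                   # filtered action on the constraint boundary
```

With these example numbers the nominal action violates the constraint (slack = -0.9), and the filter returns the closest action on the constraint boundary; when the nominal action is already safe, it passes through unchanged.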