Learned Lyapunov Shielding for Adaptive Control

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses safe adaptive control of Euler–Lagrange systems subject to unmodeled dynamics and uncertainties. It proposes a closed-form safety filter that removes the need for online quadratic programming. The approach combines a Cholesky-parameterized quadratic Lyapunov function, a residual Soft Actor-Critic policy, a physics-informed neural network (PINN), and a projection based on the Lyapunov derivative that corrects control inputs in real time so that the stability constraint holds. The theoretical analysis establishes global feasibility of the filter under a drift-decay condition, exponential stability under exact shielding, almost-sure convergence of the three-timescale updates to a KKT point, and a PAC generalization bound for the learned certificate. On a 2-DOF manipulator, tracking error drops by 41% under nominal friction and by 24% under aggressive friction, measured at the centroid of the training distribution. A 7-DOF study on a Franka Emika Panda shows that the full pipeline converges at industrial scale, identifies when gains over exact model-based baselines should and should not be expected, and documents a warm-start pathology of the learned certificate.
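To make the Cholesky idea concrete, here is a minimal sketch of a quadratic certificate V(x) = xᵀ L Lᵀ x whose positive-definiteness holds by construction. It is an illustration under stated assumptions, not the paper's implementation: the function names, the softplus mapping on the diagonal, and the `eps` floor are all hypothetical choices, and the paper's "structured-quadratic" form may contain more structure than a constant matrix.

```python
import math


def build_cholesky_factor(params, n, eps=1e-3):
    """Map n*(n+1)/2 unconstrained numbers to a lower-triangular factor L.

    Assumption (not from the source): the diagonal goes through softplus
    plus a small floor eps, so it is strictly positive and L is invertible.
    """
    if len(params) != n * (n + 1) // 2:
        raise ValueError("expected n*(n+1)/2 parameters")
    L = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            p = params[k]
            if i == j:
                # Numerically stable softplus: log(1 + exp(p)).
                L[i][j] = max(p, 0.0) + math.log1p(math.exp(-abs(p))) + eps
            else:
                L[i][j] = p
            k += 1
    return L


def lyapunov_value(L, x):
    """Evaluate V(x) = x^T L L^T x, computed as the squared norm of L^T x."""
    n = len(x)
    y = [sum(L[i][j] * x[i] for i in range(n)) for j in range(n)]
    return sum(v * v for v in y)
```

Because L is lower triangular with a strictly positive diagonal, L Lᵀ is positive definite for any parameter values, so a gradient-based learner can update `params` freely without a separate positivity check.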

📝 Abstract

We augment the Slotine--Li adaptive controller for Euler--Lagrange systems with three learned components: a structured-quadratic Lyapunov function \(V_\psi\) whose positive-definiteness follows from a Cholesky parameterization, a residual Soft Actor--Critic policy that adds bounded torque corrections to the analytic baseline, and a physics-informed neural network that estimates unmodeled dynamics. A closed-form safety filter, derived from the single affine constraint \(\dot V_\psi + \alpha V_\psi \le 0\), projects every policy output onto the safe set without requiring an online QP solver. We prove: global feasibility of the filter under a drift-decay condition on the control-degeneracy set; exponential stability under exact shielding, with a robust extension whose margin depends on the PINN approximation error; almost-sure convergence of the three-timescale policy--certificate--multiplier updates to a KKT point; and a PAC generalization bound for the certificate over compacts. On a 2-DOF manipulator with nonlinear friction and variable payload, the learned certificate accounts for most of the empirical gain: tracking error drops by 41\% on nominal friction and 24\% on aggressive friction at the centroid of the training distribution. A 7-DOF scalability study on a Franka Emika Panda confirms clean convergence of the full pipeline at industrial scale, identifies the conditions under which gains over exact model-based baselines should and should not be expected, and documents a warm-start pathology of the learned certificate that has practical implications for deployment.
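A single affine constraint on the control admits a closed-form Euclidean projection, which is why no QP solver is needed. The sketch below shows the standard half-space projection, assuming the Lyapunov derivative is affine in the input, i.e. dV/dt = drift + b·u. The names `shield`, `drift`, `b`, and `tol` are hypothetical, and treating near-zero `b` as the control-degeneracy set is one plausible reading of the abstract rather than the paper's stated definition.

```python
def shield(u_nom, drift, b, V, alpha, tol=1e-9):
    """Project u_nom onto {u : drift + b.u + alpha*V <= 0}.

    Returns (u_safe, feasible). The dynamics model behind drift and b is an
    assumption here; in the paper it would come from the certificate and the
    (partly learned) system model.
    """
    residual = drift + sum(bi * ui for bi, ui in zip(b, u_nom)) + alpha * V
    if residual <= 0.0:
        # The nominal input already satisfies the decay constraint.
        return list(u_nom), True
    b_sq = sum(bi * bi for bi in b)
    if b_sq <= tol:
        # The input cannot change dV/dt here, so the constraint can hold
        # only if the drift alone decays; report infeasibility.
        return list(u_nom), False
    # Minimum-norm correction that lands exactly on the constraint boundary.
    scale = residual / b_sq
    return [ui - scale * bi for ui, bi in zip(u_nom, b)], True
```

For instance, with `u_nom=[1.0, 0.0]`, `drift=0.5`, `b=[2.0, 0.0]`, `V=1.0`, and `alpha=1.0`, the nominal input violates the constraint and the filter shifts it along `b` until the constraint holds with equality. An input that already satisfies the constraint is returned unchanged.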

Problem

Research questions and friction points this paper is trying to address.

adaptive control

safety

stability

unmodeled dynamics

Euler-Lagrange systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learned Lyapunov Shielding

Physics-Informed Neural Network

Adaptive Control