Learned Lyapunov Shielding for Adaptive Control

📅 2026-05-07
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses safe adaptive control of Euler–Lagrange systems subject to unmodeled dynamics and uncertainties. It augments the Slotine–Li adaptive controller with a Cholesky-parameterized quadratic Lyapunov function, a residual Soft Actor-Critic policy, and a physics-informed neural network (PINN) that estimates unmodeled dynamics. A closed-form safety filter, derived from a constraint on the Lyapunov derivative, corrects control inputs in real time without online quadratic programming. The theoretical analysis establishes global feasibility of the filter under a drift-decay condition, exponential stability under exact shielding, almost-sure convergence of the three-timescale algorithm to a KKT point, and a PAC generalization bound for the learned certificate. On a 2-DOF manipulator, tracking error drops by 41% under nominal friction and 24% under aggressive friction at the centroid of the training distribution. A 7-DOF study on a Franka Emika Panda shows clean convergence of the full pipeline at industrial scale, identifies when gains over exact model-based baselines should and should not be expected, and documents a warm-start pathology of the learned certificate.
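As an illustration of why a Cholesky parameterization gives positive-definiteness by construction, the following is a minimal sketch, not the paper's implementation. It assumes a purely quadratic form \(V(x) = x^\top L L^\top x\) with a softplus-plus-floor diagonal; the function names, the `eps` floor, and the flat parameter layout are hypothetical choices made for this example.

```python
import numpy as np

def cholesky_factor(theta, n, eps=1e-3):
    """Build a lower-triangular L with strictly positive diagonal.

    theta is a flat vector of length n*(n+1)/2 (hypothetical layout:
    row-major order of the lower triangle).
    """
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = theta
    diag = np.diag_indices(n)
    # softplus keeps the diagonal positive; eps bounds it away from zero (assumption)
    L[diag] = np.logaddexp(0.0, L[diag]) + eps
    return L

def lyapunov_value(theta, x, eps=1e-3):
    """Evaluate V(x) = x^T (L L^T) x = ||L^T x||^2."""
    x = np.asarray(x, dtype=float)
    L = cholesky_factor(theta, x.shape[0], eps)
    z = L.T @ x
    return float(z @ z)
```

Because L is triangular with a positive diagonal, it is invertible, so \(L L^\top\) is positive definite for every value of `theta` and no constraint has to be enforced during training.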
📝 Abstract
We augment the Slotine–Li adaptive controller for Euler–Lagrange systems with three learned components: a structured-quadratic Lyapunov function \(V_ψ\) whose positive-definiteness follows from a Cholesky parameterization, a residual Soft Actor–Critic policy that adds bounded torque corrections to the analytic baseline, and a physics-informed neural network that estimates unmodeled dynamics. A closed-form safety filter, derived from the single affine constraint \(\dot V_ψ + αV_ψ \le 0\), projects every policy output onto the safe set without requiring an online QP solver. We prove: global feasibility of the filter under a drift-decay condition on the control-degeneracy set; exponential stability under exact shielding, with a robust extension whose margin depends on the PINN approximation error; almost-sure convergence of the three-timescale policy–certificate–multiplier updates to a KKT point; and a PAC generalization bound for the certificate over compacts. On a 2-DOF manipulator with nonlinear friction and variable payload, the learned certificate accounts for most of the empirical gain: tracking error drops by 41% on nominal friction and 24% on aggressive friction at the centroid of the training distribution. A 7-DOF scalability study on a Franka Emika Panda confirms clean convergence of the full pipeline at industrial scale, identifies the conditions under which gains over exact model-based baselines should and should not be expected, and documents a warm-start pathology of the learned certificate that has practical implications for deployment.
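To make the "single affine constraint" concrete: for control-affine dynamics the condition \(\dot V_ψ + αV_ψ \le 0\) can be written as \(a + b^\top u \le 0\), and the minimum-norm projection onto that half-space has a closed form. The sketch below shows this standard projection under that assumption; it is not claimed to be the paper's exact filter, and the names `shield`, `a`, `b`, and `tol` are hypothetical (here `a` stands for the drift term plus \(αV_ψ\), and `b` for the gradient of \(\dot V_ψ\) with respect to the input).

```python
import numpy as np

def shield(u_nom, a, b, tol=1e-9):
    """Project u_nom onto the half-space {u : a + b @ u <= 0}.

    Returns the closest input (Euclidean norm) that satisfies the
    constraint. No QP solver is needed because there is one constraint.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    b = np.asarray(b, dtype=float)
    violation = a + b @ u_nom
    if violation <= 0.0:
        # Nominal input already satisfies the decrease condition.
        return u_nom.copy()
    bb = b @ b
    if bb <= tol:
        # b ~ 0: the input cannot influence the constraint, so feasibility
        # requires a <= 0 at this state (assumed to be what a drift-decay
        # condition on the degeneracy set would guarantee).
        raise ValueError("constraint infeasible: b vanishes while a > 0")
    # Move along -b just far enough to land on the constraint boundary.
    return u_nom - (violation / bb) * b

# Usage: a violating input is moved onto the boundary a + b @ u = 0.
u_safe = shield(u_nom=[1.0, 0.0], a=1.0, b=[1.0, 0.0])
```

The correction is zero whenever the nominal input is already safe, so the filter changes the policy output only when the decrease condition would otherwise be violated.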
Problem

Research questions and friction points this paper is trying to address.

adaptive control
safety
stability
unmodeled dynamics
Euler-Lagrange systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learned Lyapunov Shielding
Physics-Informed Neural Network
Adaptive Control
Safety Filter
Exponential Stability