V-OCBF: Learning Safety Filters from Offline Data via Value-Guided Offline Control Barrier Functions

📅 2025-12-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses offline safety-critical control for autonomous systems when no dynamics model, online interaction, or expert-designed control barrier function (CBF) is available. Method: The authors propose a model-free framework for learning a neural CBF. It uses a value-guided, recursive finite-difference barrier update together with expectile regression, which restricts updates to actions supported by the offline dataset. The learned barrier is then embedded in a quadratic program (QP) that synthesizes safe controls in real time, with the aim of keeping the learned safe set forward invariant while limiting distributional shift. Contribution/Results: The method learns an in-distribution CBF without a dynamics model or expert supervision. Across multiple tasks it substantially reduces safety violations relative to baselines while preserving strong task performance.
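To make the phrase "finite-difference barrier" concrete, the sketch below checks the standard discrete-time CBF condition B(s') - B(s) >= -alpha * B(s) on logged transitions, using only state pairs and no dynamics model. It is a generic illustration, not the paper's update rule: the names `cbf_residual`, `violation_rate`, `toy_barrier`, the value of `alpha`, and the toy data are all assumptions introduced here.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]


def cbf_residual(b_s: float, b_next: float, alpha: float) -> float:
    """Finite-difference CBF residual: B(s') - B(s) + alpha * B(s).

    A non-negative value means the transition satisfies the discrete-time
    barrier condition B(s') >= (1 - alpha) * B(s).
    """
    return b_next - b_s + alpha * b_s


def violation_rate(
    barrier: Callable[[State], float],
    transitions: Sequence[Tuple[State, State]],
    alpha: float = 0.1,
) -> float:
    """Fraction of logged (s, s') pairs whose residual is negative."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    bad = sum(
        1
        for s, s_next in transitions
        if cbf_residual(barrier(s), barrier(s_next), alpha) < 0.0
    )
    return bad / len(transitions)


# Hypothetical 1-D example: the safe set is |x| <= 1, so B(x) = 1 - x^2.
def toy_barrier(s: State) -> float:
    return 1.0 - s[0] ** 2


# Hypothetical logged transitions (s, s'); the last one leaves the safe set.
toy_data = [((0.0,), (0.1,)), ((0.5,), (0.55,)), ((0.9,), (1.2,))]
rate = violation_rate(toy_barrier, toy_data, alpha=0.1)
```

In a learning setting, a residual of this kind would typically appear as a penalty on a neural barrier evaluated over dataset transitions; how V-OCBF builds its recursive update from it is not specified in this summary.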

📝 Abstract
Ensuring safety in autonomous systems requires controllers that satisfy hard, state-wise constraints without relying on online interaction. While existing Safe Offline RL methods typically enforce soft expected-cost constraints, they do not guarantee forward invariance. Conversely, Control Barrier Functions (CBFs) provide rigorous safety guarantees but usually depend on expert-designed barrier functions or full knowledge of the system dynamics. We introduce Value-Guided Offline Control Barrier Functions (V-OCBF), a framework that learns a neural CBF entirely from offline demonstrations. Unlike prior approaches, V-OCBF does not assume access to the dynamics model; instead, it derives a recursive finite-difference barrier update, enabling model-free learning of a barrier that propagates safety information over time. Moreover, V-OCBF incorporates an expectile-based objective that avoids querying the barrier on out-of-distribution actions and restricts updates to the dataset-supported action set. The learned barrier is then used with a Quadratic Program (QP) formulation to synthesize real-time safe control. Across multiple case studies, V-OCBF yields substantially fewer safety violations than baseline methods while maintaining strong task performance, highlighting its scalability for offline synthesis of safety-critical controllers without online interaction or hand-engineered barriers.
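The expectile-based objective mentioned in the abstract can be illustrated with the standard asymmetric squared loss. The sketch below is a minimal, generic version: it assumes the expectile is used in the style of implicit Q-learning, where a level `tau` close to 1 approximates a maximum over the values seen in the data without evaluating unseen actions. The function names and sample values are hypothetical, and the exact loss used by V-OCBF may differ.

```python
from typing import Sequence


def expectile_loss(residual: float, tau: float) -> float:
    """Asymmetric squared loss |tau - 1[residual < 0]| * residual^2."""
    weight = tau if residual >= 0.0 else 1.0 - tau
    return weight * residual * residual


def expectile(values: Sequence[float], tau: float, iters: int = 200) -> float:
    """Minimizer m of sum_i expectile_loss(v_i - m, tau).

    Uses the first-order condition sum_i w_i * (v_i - m) = 0 as a
    fixed-point iteration, where w_i is tau above m and 1 - tau below.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    m = sum(values) / len(values)  # tau = 0.5 recovers the mean
    for _ in range(iters):
        weights = [tau if v >= m else 1.0 - tau for v in values]
        m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return m


# Hypothetical barrier targets for several dataset actions at one state.
samples = [0.0, 1.0, 2.0, 10.0]
mean_like = expectile(samples, 0.5)
upper = expectile(samples, 0.9)
near_max = expectile(samples, 0.99)
```

As `tau` grows, the estimate moves from the sample mean toward the largest sample, but it never exceeds the values present in the data, which is the property that keeps the update within the dataset-supported action set.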
Problem

Research questions and friction points this paper is trying to address.

Ensuring safety in autonomous systems without online interaction
Learning safety filters from offline data without dynamics models
Providing rigorous safety guarantees without hand-engineered barrier functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns neural control barrier functions from offline data
Uses a recursive finite-difference barrier update that requires no dynamics model
Combines an expectile objective with a QP for real-time safe control
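For the QP step, a useful reference point is the simplest safety filter: minimally modify a reference control so that one affine barrier constraint holds. With a single constraint a · u + b >= 0, the QP has a closed-form solution (a projection onto a half-space), sketched below. The function name `safety_filter` and the inputs `a`, `b`, and `u_ref` are hypothetical; how V-OCBF obtains its constraint terms from the learned barrier without a dynamics model is not described in this summary, and its QP may include further constraints.

```python
from typing import List, Sequence


def safety_filter(
    u_ref: Sequence[float], a: Sequence[float], b: float
) -> List[float]:
    """Solve min ||u - u_ref||^2 subject to a . u + b >= 0 in closed form."""
    slack = sum(ai * ui for ai, ui in zip(a, u_ref)) + b
    if slack >= 0.0:
        # The reference control already satisfies the constraint.
        return list(u_ref)
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("infeasible constraint: a is zero and b is negative")
    # Project u_ref onto the boundary of the half-space a . u + b >= 0.
    scale = slack / norm_sq
    return [ui - scale * ai for ui, ai in zip(u_ref, a)]


# Hypothetical example: the constraint -u[0] + 0.5 >= 0 caps u[0] at 0.5.
filtered = safety_filter(u_ref=[1.0, 0.0], a=[-1.0, 0.0], b=0.5)
unchanged = safety_filter(u_ref=[0.2, 0.3], a=[-1.0, 0.0], b=0.5)
```

The filter leaves safe reference controls untouched and otherwise returns the nearest control on the constraint boundary, which is why such filters tend to preserve task performance when the constraint is rarely active. With several constraints or input bounds, a numerical QP solver would be needed instead of this closed form.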
Mumuksh Tayal
Indian Institute of Science, Bangalore
Offline RL, Imitation Learning, Safe Control, Hardware Aware Algorithms
Manan Tayal
Indian Institute of Science (IISc), Bangalore
Safe Control, Robot Learning, AI, Robotics
Aditya Singh
Center for Cyber Physical Systems (CPS), Indian Institute of Science (IISc) Bengaluru
Shishir Kolathaya
Center for Cyber Physical Systems (CPS), Indian Institute of Science (IISc) Bengaluru
Ravi Prakash
Center for Cyber Physical Systems (CPS), Indian Institute of Science (IISc) Bengaluru