Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding

📅 2024-05-28
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of safety guarantees when training reinforcement learning agents in unknown black-box environments, this paper proposes ADVICE: a model-free, online, and interpretable adaptive action-masking framework. Its core innovation is a contrastive autoencoder that disentangles safety-critical features without supervision, coupled with runtime action masking and online estimation of safety boundaries — enabling real-time identification and suppression of hazardous state-action pairs without prior domain knowledge. ADVICE maintains competitive task performance while substantially improving training safety: experiments demonstrate approximately a 50% reduction in safety violations, with task reward matching that of current state-of-the-art methods. Crucially, ADVICE jointly addresses safety supervision and policy learning in black-box settings, establishing a safety-aware training paradigm for real-world deployment.
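The runtime shielding step described above can be sketched as follows. Note this is a minimal illustration, not the paper's implementation: the `encode` function, the latent centroids, and the fallback behavior are all hypothetical stand-ins for the contrastive autoencoder and safety boundary that ADVICE learns online.

```python
import math

# Hypothetical stand-in for ADVICE's learned contrastive encoder: in the
# paper this embedding is learned from experience; here a toy 2-D latent
# is hard-coded purely for illustration (positive actions move toward a
# hazard on the right).
def encode(state, action):
    return (state[0] + action, state[1])

# Hypothetical cluster centers for safe and unsafe state-action embeddings.
SAFE_CENTROID = (-1.0, 0.0)
UNSAFE_CENTROID = (1.0, 0.0)

def shield(state, candidate_actions):
    """Post-shield: keep only actions whose latent embedding lies closer
    to the safe cluster than to the unsafe one; fall back to the full
    action set if everything would be masked."""
    allowed = []
    for a in candidate_actions:
        z = encode(state, a)
        if math.dist(z, SAFE_CENTROID) <= math.dist(z, UNSAFE_CENTROID):
            allowed.append(a)
    return allowed or list(candidate_actions)

state = (0.0, 0.0)
print(shield(state, [-1.0, -0.5, 0.5, 1.0]))  # → [-1.0, -0.5]
```

In this toy setup, the two actions that would step toward the hazard are suppressed before the agent can execute them, while the agent remains free to choose among the remaining actions.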

📝 Abstract
Empowering safe exploration of reinforcement learning (RL) agents during training is a critical challenge towards their deployment in many real-world scenarios. When prior knowledge of the domain or task is unavailable, training RL agents in unknown, black-box environments presents an even greater safety risk. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training, and uses this knowledge to protect the RL agent from executing actions that yield likely hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques shows that ADVICE significantly reduces safety violations (≈50%) during training, with a competitive outcome reward compared to other techniques.
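The abstract's "distinguishes safe and unsafe features of state-action pairs" rests on a contrastive objective. The paper's exact loss is not reproduced here; the sketch below shows the standard pairwise contrastive formulation (Hadsell et al. style) that such a disentanglement would typically build on, with the `margin` value chosen arbitrarily.

```python
import math

def contrastive_loss(z1, z2, same_label, margin=1.0):
    """Pairwise contrastive loss: pull embeddings of pairs with the same
    safety label together, and push pairs with different labels at least
    `margin` apart in latent space."""
    d = math.dist(z1, z2)
    if same_label:
        return 0.5 * d * d                     # attract same-label pairs
    return 0.5 * max(0.0, margin - d) ** 2     # repel different-label pairs

print(contrastive_loss((0.0, 0.0), (0.0, 0.0), same_label=True))    # → 0.0
print(contrastive_loss((0.0, 0.0), (2.0, 0.0), same_label=False))   # → 0.0 (already beyond margin)
print(contrastive_loss((0.0, 0.0), (0.5, 0.0), same_label=False))   # → 0.125 (too close, penalized)
```

Trained this way, safe and unsafe state-action pairs form separated clusters in the latent space, which is what makes the distance-based runtime masking possible.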
Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning
Safety Assurance
Unknown Environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

ADVICE
Safe Reinforcement Learning
Robotics Training
Daniel Bethell
University of York, UK
Simos Gerasimou
Associate Professor (Senior Lecturer) in Computer Science, University of York
Self-Adaptive Systems, Software Engineering, AI Safety
R. Calinescu
University of York, UK
Calum Imrie
University of York, UK