🤖 AI Summary
To address the lack of safety guarantees when training reinforcement learning agents in unknown, black-box environments, this paper proposes ADVICE: a model-free, online, and interpretable adaptive action-masking framework. Its core innovation is a contrastive autoencoder that separates safety-critical features of state-action pairs without supervision, coupled with runtime action masking and online estimation of safety boundaries—enabling hazardous state-action pairs to be identified and suppressed in real time without prior domain knowledge. ADVICE maintains competitive task performance while substantially improving safety during training: experiments demonstrate roughly a 50% reduction in safety violations, with task reward matching that of current state-of-the-art methods. Crucially, the authors present ADVICE as the first approach to jointly address safety supervision and policy learning in black-box settings, establishing a reliable safety-aware training paradigm for real-world deployment.
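The masking mechanism described above—embedding state-action pairs and suppressing those that look unsafe—can be sketched in a few lines. This is a hypothetical illustration only, not the paper's implementation: in ADVICE the encoder would be a trained contrastive autoencoder, whereas here `encode` is a fixed linear map, and the safe/unsafe centroids and `threshold` are hand-picked placeholders.

```python
import numpy as np

# Placeholder "encoder": maps a 4-D (state, action) vector to a 2-D latent.
# In ADVICE this role is played by the contrastive autoencoder's encoder.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

def encode(state_action):
    return state_action @ W

# Assumed centroids of previously observed safe/unsafe latent embeddings.
SAFE_CENTROID = np.array([0.0, 0.0])
UNSAFE_CENTROID = np.array([3.0, 3.0])

def is_action_allowed(state, action, threshold=1.0):
    """Allow an action only if its latent embedding is closer to the safe
    centroid than to the unsafe one (scaled by `threshold`)."""
    z = encode(np.concatenate([state, action]))
    return np.linalg.norm(z - SAFE_CENTROID) < threshold * np.linalg.norm(z - UNSAFE_CENTROID)

def shielded_action(state, candidate_actions):
    """Return the first candidate that passes the mask; if none do, fall
    back to the candidate farthest from the unsafe centroid."""
    for a in candidate_actions:
        if is_action_allowed(state, a):
            return a
    return max(candidate_actions,
               key=lambda a: np.linalg.norm(
                   encode(np.concatenate([state, a])) - UNSAFE_CENTROID))
```

With this toy encoder, an action that pushes the embedding towards the unsafe centroid is masked out, while a near-origin action passes—mirroring, at a cartoon level, how a post-shield intercepts and replaces likely hazardous actions at runtime.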
📝 Abstract
Empowering safe exploration of reinforcement learning (RL) agents during training is a critical challenge towards their deployment in many real-world scenarios. When prior knowledge of the domain or task is unavailable, training RL agents in unknown, *black-box* environments presents an even greater safety risk. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training, and uses this knowledge to protect the RL agent from executing actions that yield likely hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques shows that ADVICE significantly reduces safety violations (≈50%) during training, with a competitive outcome reward compared to other techniques.