🤖 AI Summary
Existing reinforcement learning (RL) toolkits generally lack native support for hard safety constraints and interpretable decision-making. To address this, we propose the first lightweight, open-source library that integrates SHAP values and Grad-CAM saliency maps directly into the constrained RL training pipeline (specifically, the DQN framework) to jointly optimize for safety and interpretability. Our approach uses constraint-aware reward shaping and custom Gym environment wrappers, enabling integration with existing RL codebases without modification, and supports real-time attribution of decisions as well as quantitative violation analysis. Evaluated on a safety-constrained CartPole variant, our method reduces the safety violation rate to below 0.3%, produces visually verifiable attribution maps, achieves sub-8 ms per-step inference latency, and installs with a single pip command. This work addresses critical gaps in trustworthy, production-ready constrained RL.
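The constraint-aware reward shaping described above can be sketched in a few lines. This is an illustrative example, not the SafeRL-Lite API: the class names (`ConstraintWrapper`, `LineEnv`) and the penalty scheme are assumptions chosen to show the technique of wrapping an environment, penalizing unsafe states, and tracking the violation rate.

```python
# Illustrative sketch (not the SafeRL-Lite API): a constraint-aware wrapper
# that penalizes unsafe observations and tracks the violation rate.

class ConstraintWrapper:
    """Wraps any env exposing step(action) -> (obs, reward, done) and
    subtracts a penalty whenever a user-supplied constraint is violated."""

    def __init__(self, env, constraint_fn, penalty=10.0):
        self.env = env
        self.constraint_fn = constraint_fn  # returns True if obs is safe
        self.penalty = penalty
        self.steps = 0
        self.violations = 0

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.steps += 1
        if not self.constraint_fn(obs):
            self.violations += 1
            reward -= self.penalty  # constraint-aware reward shaping
        return obs, reward, done

    def violation_rate(self):
        return self.violations / max(self.steps, 1)


# Toy 1-D environment: the state drifts by the chosen action (-1 or +1).
class LineEnv:
    def __init__(self):
        self.x = 0.0

    def step(self, action):
        self.x += action
        return self.x, 1.0, abs(self.x) > 5

env = ConstraintWrapper(LineEnv(), constraint_fn=lambda x: abs(x) <= 2)
total = 0.0
for a in [1, 1, 1, -1, -1]:  # the third step leaves the safe region |x| <= 2
    _, r, _ = env.step(a)
    total += r
print(env.violation_rate())  # 1 violation in 5 steps -> 0.2
```

Because the wrapper only relies on the `step` signature, the underlying agent and training loop need no changes, which is what makes this pattern a zero-modification fit for existing codebases.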
📝 Abstract
We introduce SafeRL-Lite, an open-source Python library for building reinforcement learning (RL) agents that are both constrained and explainable. Existing RL toolkits often lack native mechanisms for enforcing hard safety constraints or producing human-interpretable rationales for decisions. SafeRL-Lite provides modular wrappers around standard Gym environments and deep Q-learning agents to enable: (i) safety-aware training via constraint enforcement, and (ii) real-time post-hoc explanation via SHAP values and saliency maps. The library is lightweight, extensible, and installable via pip, and includes built-in metrics for constraint violations. We demonstrate its effectiveness on constrained variants of CartPole and provide visualizations that reveal both policy logic and safety adherence. The full codebase is available at: https://github.com/satyamcser/saferl-lite.
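The post-hoc explanation side can be illustrated with a minimal perturbation-based attribution over a Q-function. This is a sketch in the spirit of the library's real-time explanations, not its implementation: actual SHAP values would come from the `shap` package, and the `toy_q` function and its weights are assumptions for demonstration.

```python
# Illustrative sketch (not SafeRL-Lite's implementation): perturbation-based
# feature attribution for a Q-function, approximating the kind of per-decision
# saliency the library reports. Real SHAP values require the `shap` package.

def perturbation_saliency(q_fn, state, action, eps=1e-3):
    """Score each state feature by how much a small nudge changes Q(s, a)."""
    base = q_fn(state, action)
    scores = []
    for i in range(len(state)):
        perturbed = list(state)
        perturbed[i] += eps
        scores.append(abs(q_fn(perturbed, action) - base) / eps)
    return scores

# Toy linear Q-function: action 0 weighs features [3, 0], action 1 weighs [0, 1].
weights = {0: [3.0, 0.0], 1: [0.0, 1.0]}

def toy_q(state, action):
    return sum(w * s for w, s in zip(weights[action], state))

sal = perturbation_saliency(toy_q, [0.5, -0.2], action=0)
```

For the linear toy Q-function the scores recover the action's weights, so feature 0 dominates the explanation for action 0; on a trained DQN the same loop yields a per-decision importance vector that can be rendered as an attribution map.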