Assuring the Safety of Reinforcement Learning Components: AMLAS-RL

📅 2025-07-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Safety-critical cyber-physical systems (CPS) lack end-to-end safety assurance for reinforcement learning (RL) components: existing safe-RL approaches lack systematic assurance structures, while the assurance-case-based AMLAS framework is ill-suited to RL's dynamic decision-making and interactive behavior. Method: This paper proposes AMLAS-RL, an assurance framework that systematically adapts the structured assurance-argument methodology of AMLAS to RL. It establishes a closed-loop safety evidence chain spanning requirements specification, policy training, verification and validation, and deployment and operation. By integrating RL algorithms, verifiable safety constraints, and an iterative assurance-argument generation process, it enables end-to-end traceability from formal safety goals to auditable evidence. Contribution/Results: AMLAS-RL is the first framework to instantiate AMLAS for RL. Its feasibility and practicality are demonstrated through a complete assurance case for obstacle-avoidance navigation in a wheeled mobile robot, achieving fully closed-loop evidence generation and validation.

📝 Abstract
The rapid advancement of machine learning (ML) has led to its increasing integration into cyber-physical systems (CPS) across diverse domains. While CPS offer powerful capabilities, incorporating ML components introduces significant safety and assurance challenges. Among ML techniques, reinforcement learning (RL) is particularly suited for CPS due to its capacity to handle complex, dynamic environments where explicit models of interaction between system and environment are unavailable or difficult to construct. However, in safety-critical applications, this learning process must not only be effective but demonstrably safe. Safe-RL methods aim to address this by incorporating safety constraints during learning, yet they fall short in providing systematic assurance across the RL lifecycle. The AMLAS methodology offers structured guidance for assuring the safety of supervised learning components, but it does not directly apply to the unique challenges posed by RL. In this paper, we adapt AMLAS to provide a framework for generating assurance arguments for an RL-enabled system through an iterative process; AMLAS-RL. We demonstrate AMLAS-RL using a running example of a wheeled vehicle tasked with reaching a target goal without collision.
Problem

Research questions and friction points this paper is trying to address.

Ensuring safety in reinforcement learning for cyber-physical systems
Addressing lack of systematic assurance in RL lifecycle
Adapting AMLAS methodology to RL-specific safety challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapting AMLAS for RL safety assurance
Iterative framework for RL-enabled systems
Safe-RL with systematic lifecycle assurance
Calum Corrie Imrie
Department of Computer Science, University of York, York, U.K.
Ioannis Stefanakos
Department of Computer Science, University of York, York, U.K.
Sepeedeh Shahbeigi
Department of Computer Science, University of York, York, U.K.
Richard Hawkins
Associate Professor, University of York
Simon Burton
University of York
Safety, machine learning, autonomous systems, formal methods and testing