Is Your Model Risk ALARP? Evaluating Prospective Safety-Critical Applications of Complex Models

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Risk assessment of complex models in safety-critical applications remains challenging because the trade-off between performance gains and potential catastrophic failures is difficult to quantify. Method: This paper proposes a unified risk–benefit assessment framework integrating statistical decision theory, uncertainty quantification, and value of information analysis, enabling a systematic determination of whether model risk satisfies the "as low as reasonably practicable" (ALARP) principle. The framework weighs safety, efficacy, and emissions-reduction benefits against the expected consequence of erroneous predictions, and establishes an interpretable, end-to-end risk modelling and quantification pipeline. Contribution/Results: Applied to automated weld radiograph classification, the framework identifies high-risk decision boundaries and provides principled deployment criteria, supporting trustworthy machine learning in high-reliability domains.

📝 Abstract
The increasing availability of advanced computational modelling offers new opportunities to improve safety, efficacy, and emissions reductions. Application of complex models to support engineering decisions has been slow in comparison to other sectors, reflecting the higher consequence of unsafe applications. Adopting a complex model introduces a *model risk*, namely the expected consequence of incorrect or otherwise unhelpful outputs. This should be weighed against the prospective benefits that the more sophisticated model can provide, also accounting for the non-zero risk of existing practice. Demonstrating when the model risk of a proposed machine learning application is As Low As Reasonably Practicable (ALARP) can help ensure that safety-critical industries benefit from complex models where appropriate while avoiding their misuse. An example of automated weld radiograph classification is presented to demonstrate how this can be achieved by combining statistical decision analysis, uncertainty quantification, and value of information.
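The abstract's core comparison, weighing the expected consequence of a model's errors against the non-zero risk of existing practice, can be sketched as a simple expected-loss calculation. All error rates, costs, and the defect prevalence below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch: expected consequence (model risk) of an automated
# weld-radiograph classifier versus existing manual inspection.
# All probabilities and costs are illustrative assumptions.

p_defect = 0.02  # assumed prevalence of defective welds

# Consequence costs in arbitrary monetary units (assumed)
cost_false_accept = 1_000_000  # defective weld passed, potential failure in service
cost_false_reject = 5_000      # sound weld rejected, unnecessary repair

def expected_risk(p_false_accept, p_false_reject):
    """Expected consequence per weld inspected, given the two error rates."""
    return (p_defect * p_false_accept * cost_false_accept
            + (1 - p_defect) * p_false_reject * cost_false_reject)

# Assumed error rates for the candidate classifier and for manual practice
model_risk = expected_risk(0.01, 0.05)
baseline_risk = expected_risk(0.03, 0.02)

print(f"model risk    : {model_risk:.2f} per weld")
print(f"baseline risk : {baseline_risk:.2f} per weld")
```

Under these assumed numbers the classifier reduces expected consequence despite its higher false-reject rate, because false accepts dominate the loss; with different costs the ranking can reverse, which is exactly why the comparison must be made explicitly.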
Problem

Research questions and friction points this paper is trying to address.

Evaluating model risk in safety-critical applications
Balancing benefits and risks of complex models
Ensuring ALARP compliance for machine learning models
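The ALARP principle named in the bullets above is usually operationalised as a "gross disproportion" test: a risk-reduction measure is reasonably practicable unless its cost grossly outweighs the risk it averts. A minimal sketch, with an assumed disproportion factor and illustrative per-weld values:

```python
# Hypothetical ALARP gross-disproportion test. The disproportion factor and
# the monetary values below are illustrative assumptions.

risk_reduction = 250.0        # expected consequence averted per weld (assumed)
measure_cost = 300.0          # cost of the additional safeguard per weld (assumed)
disproportion_factor = 3.0    # factors of roughly 1-10 are commonly cited (assumed)

# The measure is required under ALARP unless its cost exceeds the
# disproportion factor times the benefit it delivers.
grossly_disproportionate = measure_cost > disproportion_factor * risk_reduction
print("measure required under ALARP:", not grossly_disproportionate)
```

Here the safeguard costs more than the risk it averts, but not *grossly* more, so ALARP would still require it under the assumed factor.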
Innovation

Methods, ideas, or system contributions that make the work stand out.

Statistical decision analysis for model risk
Uncertainty quantification in safety-critical applications
Value of information for ALARP demonstration
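The value-of-information idea in the last bullet asks whether gathering more evidence (e.g. about the classifier's error rate) is worth its cost before committing to a deployment decision. A minimal Monte Carlo sketch of the expected value of perfect information (EVPI), with an assumed Beta prior and illustrative costs:

```python
# Hypothetical EVPI sketch: is resolving uncertainty about the classifier's
# false-accept rate worth anything before deciding whether to deploy it?
# The prior, costs, and action set are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Uncertain false-accept probability: Beta(2, 98) prior, mean 0.02 (assumed)
theta = rng.beta(2, 98, size=100_000)

cost_failure = 1_000_000   # consequence of an undetected defective weld (assumed)
cost_manual = 400.0        # per-weld cost of retaining manual inspection (assumed)

# Losses for the two available actions, per sampled error rate
loss_deploy = theta * cost_failure
loss_manual = np.full_like(theta, cost_manual)

# Prior decision: commit to the single action minimising expected loss
prior_loss = min(loss_deploy.mean(), loss_manual.mean())

# Perfect information: choose the best action separately for each theta
perfect_loss = np.minimum(loss_deploy, loss_manual).mean()

evpi = prior_loss - perfect_loss  # never negative, by construction
print(f"EVPI per weld: {evpi:.4f}")
```

If the EVPI exceeds the cost of the proposed data-collection exercise, gathering the evidence first is the better decision; otherwise the prior decision already stands.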
Domenic Di Francesco
The Alan Turing Institute for Artificial Intelligence and Data Science, The British Library, 2QR, John Dodson House, 96 Euston Rd, London NW1 2DB; Department of Civil Engineering, Cambridge University, Trumpington Street, CB2 1PZ; Health and Safety Executive, Harpur Hill, Buxton SK17 9JN
Alan Forrest
Credit Research Centre, University of Edinburgh Business School, 29 Buccleuch Place, Edinburgh, EH8 9JS
Fiona McGarry
Health and Safety Executive, Harpur Hill, Buxton SK17 9JN
Nicholas Hall
Health and Safety Executive, Harpur Hill, Buxton SK17 9JN
Adam Sobey
Programme Director, Data-Centric Engineering, The Alan Turing Institute; Professor of Data-Centric Engineering
AI/Machine Learning; Evolutionary Computation; Data-Centric Engineering