🤖 AI Summary
Deep learning–driven autonomous systems suffer from neural network unreliability due to distributional shift and adversarial attacks, while existing inference frameworks running on general-purpose operating systems exhibit poor real-time performance and inherent security vulnerabilities. To address these challenges, this paper proposes a safety-enhanced architecture based on the Simplex paradigm. It innovatively integrates Type-1 real-time virtualization to establish strongly isolated dual execution environments: a trusted control domain and an untrusted inference domain. A predictable cross-domain communication mechanism and a state-aware fail-safe switching strategy are further designed to ensure timely, safe transitions upon anomaly detection. Experimental evaluation on Furuta pendulum and autonomous vehicle platforms demonstrates that the proposed approach rapidly detects anomalies in learning-based components and seamlessly reverts to a certified safe controller, effectively containing faults induced by neural networks. The solution significantly improves system safety, robustness, and real-time assurance capabilities.
📝 Abstract
Recently, the outstanding performance achieved by neural networks in many tasks has led to their deployment in autonomous systems, such as robots and vehicles. However, neural networks are not yet trustworthy, being vulnerable to anomalous samples, distribution shifts, adversarial attacks, and other threats. Furthermore, frameworks for accelerating neural network inference typically run on rich operating systems, which are less predictable in their timing behavior and present larger surfaces for cyber-attacks.
To address these issues, this paper presents a software architecture for enhancing the safety, security, and predictability of learning-based autonomous systems. It leverages two isolated execution domains: one dedicated to the execution of neural networks under a rich operating system, which is deemed untrustworthy, and one responsible for running safety-critical functions, possibly under a different operating system capable of handling real-time constraints.
Both domains are hosted on the same computing platform and isolated through a type-1 real-time hypervisor that enables fast and predictable inter-domain communication for exchanging real-time data. The two domains cooperate to provide a fail-safe mechanism based on a safety monitor, which oversees the state of the system and, whenever the behavior of the learning-based component is considered untrustworthy, switches to a simpler but safer backup module hosted in the safety-critical domain.
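The switching logic of such a Simplex-style safety monitor can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the names `SafetyMonitor`, `neural_controller`, `backup_controller`, and the scalar safety envelope are assumptions made for the example.

```python
def neural_controller(state):
    # Stand-in for the untrusted learning-based controller
    # running in the rich-OS domain (may produce unsafe outputs).
    return 2.0 * state

def backup_controller(state):
    # Stand-in for the simpler, certified safe controller
    # hosted in the safety-critical domain.
    return -0.5 * state

class SafetyMonitor:
    """Selects the backup controller whenever the plant state leaves a
    conservative safety envelope, and latches onto it afterwards so the
    system stays under certified control."""

    def __init__(self, envelope=1.0):
        self.envelope = envelope
        self.failed_over = False

    def select(self, state):
        if self.failed_over or abs(state) > self.envelope:
            self.failed_over = True  # latch: remain on the safe controller
            return backup_controller(state)
        return neural_controller(state)

monitor = SafetyMonitor(envelope=1.0)
print(monitor.select(0.5))   # inside envelope  -> neural controller: 1.0
print(monitor.select(1.5))   # outside envelope -> backup controller: -0.75
print(monitor.select(0.5))   # latched          -> backup controller: -0.25
```

In a real deployment the monitor and backup controller would run in the safety-critical domain, reading plant state over the hypervisor's inter-domain communication channel; the latch models the one-way fail-safe transition described above.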
The effectiveness of the proposed architecture is illustrated by a set of experiments performed on two control systems: a Furuta pendulum and a rover. The results confirm the utility of the fail-safe mechanism in preventing faults caused by the learning component.