🤖 AI Summary
Problem: Active learning (AL) for real-time, safety-critical decision-making faces a fundamental trade-off between query latency and safety assurance. Conventional AL retrains the model and optimizes an acquisition function at every iteration, which cannot meet millisecond-scale inference requirements under strict safety constraints.
Method: We propose the first amortized safe AL framework, replacing iterative optimization with a neural policy network that directly maps observations to safe data queries. The policy is pretrained exclusively on synthetic data, so it requires no real-world interaction, and it supports seamless switching between safety-critical and non-safety-critical modes. Our approach integrates nonparametric simulation modeling, safety-aware acquisition objectives, and a modular AL architecture.
Results: Evaluated across multiple physical systems, our method reduces inference latency by more than 100× while retaining at least 98% of standard AL performance, incurring zero safety violations, and generalizing strongly to unknown dynamical systems.
📝 Abstract
Active Learning (AL) is a sequential learning approach that aims to select the most informative data for model training. In many systems, safety constraints arise during data evaluation, which calls for safe AL methods. Key challenges of AL are the repeated model training and acquisition optimization required for data selection, which become particularly restrictive under safety constraints. This repeated effort often creates a bottleneck, especially in physical systems that require real-time decision-making. In this paper, we propose a novel amortized safe AL framework. By leveraging a pretrained neural network policy, our method eliminates the need for repeated model training and acquisition optimization, achieving substantial speed improvements while maintaining competitive learning outcomes and safety awareness. The policy is trained entirely on synthetic data using a novel safe AL objective. The resulting policy is highly versatile and adapts to a wide range of systems, as we demonstrate in our experiments. Furthermore, our framework is modular, and we show empirically that it also achieves superior performance on unconstrained, time-sensitive AL tasks when the safety requirement is omitted.
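To make the bottleneck concrete, the following is a minimal sketch of one step of a conventional safe AL loop, the kind of per-query work that the amortized policy is meant to replace. It is not the paper's method: the Gaussian process models, the RBF kernel, the pessimistic safety bound, and every name and parameter (`rbf`, `gp_posterior`, `safe_al_step`, `z_min`, `beta`, `lengthscale`) are illustrative assumptions. The amortized policy itself is not sketched, since its architecture is not described in this abstract.

```python
import numpy as np

def rbf(a, b, lengthscale=0.3):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """GP posterior mean and standard deviation (zero prior mean, unit prior variance)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def safe_al_step(x, y, z, candidates, z_min=0.0, beta=2.0):
    """One conventional safe AL step (assumed baseline, not the paper's policy).

    x: queried inputs, y: target observations, z: safety observations.
    Both models are refit from scratch, then all candidates are searched.
    """
    _, f_std = gp_posterior(x, y, candidates)        # refit model of the target
    z_mean, z_std = gp_posterior(x, z, candidates)   # refit model of the safety signal
    safe = z_mean - beta * z_std >= z_min            # pessimistic safety check
    if not safe.any():
        return None                                  # no candidate is provably safe
    score = np.where(safe, f_std, -np.inf)           # uncertainty, restricted to safe set
    return candidates[np.argmax(score)]              # most uncertain safe candidate
```

Each call refits two models and scans the full candidate set, and that cost recurs for every query. In the amortized setting described above, both steps would be replaced by a single forward pass of the pretrained policy.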