🤖 AI Summary
To address high annotation costs and low data efficiency in Deep Active Learning (DAL), this paper proposes a formal verification–based adversarial augmentation method. Specifically, it employs SMT solvers tailored to verifying ReLU networks to systematically generate robustness-violating adversarial examples, which are then injected as unlabeled augmented data into the DAL pipeline. This work is the first to integrate formal-verification-driven adversarial sample generation into a DAL framework, and it introduces a novel query strategy designed to maximize information gain. Theoretical analysis and extensive experiments on CIFAR-10/100 and SVHN demonstrate that the generated samples exhibit higher uncertainty and discriminability than conventional gradient-based adversarial examples. The method significantly improves model generalization and reduces annotation requirements by 20–35% at equivalent accuracy, validating its effectiveness and state-of-the-art performance in data-efficient learning.
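The paper's verifier-driven generation step can be illustrated with a minimal sketch. This is not the authors' implementation: real systems use SMT/Reluplex-style solvers, while the sketch below emulates their core idea — case-splitting on ReLU activation patterns, so that each case reduces to a linear program — using `scipy.optimize.linprog`. The toy network, its weights, and the function name `find_counterexample` are all invented for illustration.

```python
# Hypothetical sketch of verification-driven counterexample search on a
# tiny 2-2-2 ReLU network: logits = W2 @ relu(W1 @ x + b1) + b2.
# Each ReLU on/off pattern makes the network affine on a polytope, so
# checking for a robustness violation on that piece is a linear program.
import itertools
import numpy as np
from scipy.optimize import linprog

# Illustrative weights (not from the paper)
W1 = np.array([[1.0, 0.0], [0.0, 1.0]]); b1 = np.zeros(2)
W2 = np.array([[1.0, -1.0], [-1.0, 1.0]]); b2 = np.zeros(2)

def find_counterexample(x0, eps, true_cls=0, other_cls=1):
    """Search the L-inf ball around x0 for a point whose margin
    logit[true_cls] - logit[other_cls] drops below zero."""
    lo, hi = x0 - eps, x0 + eps
    w_diff = W2[true_cls] - W2[other_cls]
    d_const = b2[true_cls] - b2[other_cls]
    # Enumerate all ReLU activation patterns (the case split an SMT
    # solver would perform lazily).
    for s in itertools.product([0, 1], repeat=len(b1)):
        s = np.array(s, dtype=float)
        c = (w_diff * s) @ W1               # margin = c @ x + d on this piece
        d = (w_diff * s) @ b1 + d_const
        # Sign constraints defining the piece: "on" units need
        # pre-activation >= 0, "off" units need <= 0.
        A = np.vstack([-W1[i] if s[i] else W1[i] for i in range(len(s))])
        b = np.array([b1[i] if s[i] else -b1[i] for i in range(len(s))])
        res = linprog(c, A_ub=A, b_ub=b, bounds=list(zip(lo, hi)))
        if res.success and res.fun + d < 0:
            return res.x                    # witness of the violation
    return None                             # no pattern admits a violation
```

For example, with `x0 = [1.0, 0.6]` and `eps = 0.3`, the network above classifies `x0` as class 0 but admits a violating point inside the ball, which the search returns; such unlabeled counterexamples are what the paper feeds back into the DAL pool.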
📝 Abstract
Deep Active Learning (DAL) aims to reduce labeling costs in neural-network training by prioritizing the most informative unlabeled samples for annotation. Beyond selecting which samples to label, several DAL approaches further enhance data efficiency by augmenting the training set with synthetic inputs that do not require additional manual labeling. In this work, we investigate how augmenting the training data with adversarial inputs that violate robustness constraints can improve DAL performance. We show that adversarial examples generated via formal verification contribute substantially more than those produced by standard, gradient-based attacks. We apply this extension to multiple modern DAL techniques, as well as to a new technique that we propose, and show that it yields significant improvements in model generalization across standard benchmarks.
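The pool-based DAL loop the abstract describes can be sketched as follows. This is a hedged illustration under assumptions, not the paper's query strategy: the linear softmax model, the data, and the entropy-based acquisition are stand-ins, and the adversarial inputs are simulated perturbations rather than verifier-generated counterexamples.

```python
# Hypothetical sketch: rank an unlabeled pool (including synthetic
# adversarial inputs, which require no manual labels) by predictive
# entropy and query the top-k for annotation.
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(W, X):
    """Softmax probabilities of a linear model (illustrative stand-in)."""
    z = X @ W.T
    z -= z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def select_queries(W, pool, k):
    """Entropy-based acquisition: pick the k most uncertain pool points."""
    scores = entropy(predict_proba(W, pool))
    return np.argsort(scores)[-k:]

# Unlabeled pool augmented with adversarial inputs; neither needs labels
# until selected for annotation.
pool = rng.normal(size=(100, 3))
adv = pool[:10] + 0.1 * rng.normal(size=(10, 3))  # stand-in for verified counterexamples
full_pool = np.vstack([pool, adv])
W = rng.normal(size=(2, 3))
picked = select_queries(W, full_pool, k=5)
```

The design point the abstract makes is that the augmentation step only changes what enters `full_pool`; any acquisition function can then rank the combined pool, which is why the extension composes with multiple existing DAL techniques.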