Volatility in Certainty (VC): A Metric for Detecting Adversarial Perturbations During Inference in Neural Network Classifiers

📅 2025-11-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the detection of adversarial perturbations during real-time inference when no labels are available. It studies Volatility in Certainty (VC), a recently proposed, label-free confidence anomaly metric that quantifies local loss of smoothness in model output as the mean squared log-ratio of adjacent certainty values in the sorted softmax output. VC is architecture-agnostic and computationally lightweight, which makes it suitable for unsupervised, online monitoring of adversarial drift. Experiments on MNIST and CIFAR-10 with FGSM perturbations show a strong negative correlation between log(VC) and classification accuracy (ρ < −0.90 in most cases), so a rising log(VC) can serve as an early warning of performance degradation. The method thus supports timely defensive interventions in safety-critical systems without requiring ground-truth labels or model retraining.
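
One way to write the verbal definition above as a formula is sketched below. The notation is an assumption made for illustration and is not taken from the paper: c_(1) ≤ ... ≤ c_(n) denotes the certainty values sorted in ascending order, and the average is taken over the n − 1 adjacent pairs.

```latex
\mathrm{VC} \;=\; \frac{1}{n-1} \sum_{i=1}^{n-1}
  \left( \ln \frac{c_{(i+1)}}{c_{(i)}} \right)^{2},
\qquad 0 < c_{(1)} \le c_{(2)} \le \dots \le c_{(n)} \le 1 .
```

Under this reading, VC is zero when all certainty values are equal and grows as gaps open up between neighbouring values in the sorted sequence.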

📝 Abstract

Adversarial robustness remains a critical challenge in deploying neural network classifiers, particularly in real-time systems where ground-truth labels are unavailable during inference. This paper investigates Volatility in Certainty (VC), a recently proposed, label-free metric that quantifies irregularities in model confidence by measuring the dispersion of sorted softmax outputs. Specifically, VC is defined as the average squared log-ratio of adjacent certainty values, capturing local fluctuations in model output smoothness. We evaluate VC as a proxy for classification accuracy and as an indicator of adversarial drift. Experiments are conducted on artificial neural networks (ANNs) and convolutional neural networks (CNNs) trained on MNIST, as well as a regularized VGG-like model trained on CIFAR-10. Adversarial examples are generated using the Fast Gradient Sign Method (FGSM) across varying perturbation magnitudes. In addition, mixed test sets are created by gradually introducing adversarial contamination to assess VC's sensitivity under incremental distribution shifts. Our results reveal a strong negative correlation between classification accuracy and log(VC) (ρ < −0.90 in most cases), suggesting that VC effectively reflects performance degradation without requiring labeled data. These findings position VC as a scalable, architecture-agnostic, and real-time performance metric suitable for early-warning systems in safety-critical applications.
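
The definition in the abstract (average squared log-ratio of adjacent sorted certainty values) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the certainty of a sample is assumed to be its top softmax probability, the values are assumed to be sorted across a batch of samples, and the small `eps` clip is added only to avoid taking the logarithm of zero.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def volatility_in_certainty(certainty, eps=1e-12):
    """Mean squared log-ratio of adjacent values after an ascending sort."""
    c = np.sort(np.asarray(certainty, dtype=float))
    if c.size < 2:
        raise ValueError("need at least two certainty values")
    c = np.clip(c, eps, None)        # assumption: guard against log(0)
    log_ratios = np.diff(np.log(c))  # log(c[i+1] / c[i]) for each adjacent pair
    return float(np.mean(log_ratios ** 2))


def batch_vc(logits):
    """VC of one unlabeled batch, computed from raw classifier logits."""
    # Assumption: per-sample certainty is the largest softmax probability.
    certainty = softmax(logits).max(axis=1)
    return volatility_in_certainty(certainty)
```

No labels enter the computation, so `np.log(batch_vc(logits))` could be tracked per batch at inference time and compared against a baseline measured on clean data. Note that VC is exactly zero when all certainty values coincide, in which case its logarithm is undefined and needs separate handling.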

Problem

Research questions and friction points this paper is trying to address.

Detecting adversarial perturbations in neural networks during inference without ground-truth labels

Measuring model confidence irregularities using dispersion of sorted softmax outputs

Providing real-time performance monitoring for safety-critical AI applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Volatility in Certainty metric measures softmax dispersion

VC captures local fluctuations in model output smoothness

Architecture-agnostic real-time performance metric for safety
