Testing Neural Network Verifiers: A Soundness Benchmark with Hidden Counterexamples

📅 2024-12-04
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing benchmarks for neural network (NN) verifiers lack ground truth for hard instances (cases that no current verifier can verify and for which no counterexample can be found), making it difficult to validate a verifier's claim of solving such cases. Method: the authors introduce the first soundness benchmark for NN verification, built on a training method that embeds hidden, semantically valid counterexamples designed to evade standard adversarial attacks. The framework enables controllable generation of hard-to-verify instances across architectures, activation functions, input dimensions, and perturbation radii, combining adversarial training, gradient-masking mitigation, and parametric synthesis across these dimensions to systematically produce instances that expose soundness flaws. Contribution/Results: the benchmark uncovers both real and synthetic soundness bugs in multiple state-of-the-art verifiers. All code and datasets are publicly released, establishing a standardized foundation for rigorous, reproducible assessment of verifier reliability.

📝 Abstract
In recent years, many neural network (NN) verifiers have been developed to formally verify properties of neural networks, such as robustness. Although many benchmarks exist for evaluating the performance of NN verifiers, they typically lack ground truth for hard instances, i.e., those that no current verifier can verify and for which no counterexample can be found; this makes it difficult to check the soundness of a new verifier that claims to verify hard instances no other verifier can handle. We propose a soundness benchmark for NN verification. Our benchmark contains instances with deliberately inserted counterexamples, which we also hide from regular adversarial attacks that could otherwise be used to find them. We design a training method to produce neural networks with such hidden counterexamples. Our benchmark is intended for testing the soundness of NN verifiers and identifying falsely claimed verifiability when hidden counterexamples are known to exist. We systematically construct our benchmark, generating instances across diverse model architectures, activation functions, input sizes, and perturbation radii. We demonstrate that our benchmark successfully identifies bugs in state-of-the-art NN verifiers, as well as synthetic bugs, providing a crucial step toward more reliable testing of NN verifiers. Our code is available at https://github.com/MVP-Harry/SoundnessBench and our benchmark is available at https://huggingface.co/datasets/SoundnessBench/SoundnessBench.
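The core idea in the abstract is to train a network so that a specific in-ball point carries the wrong label (a planted counterexample) while the network otherwise behaves robustly. As a rough, self-contained illustration of the injection idea only (this is not the paper's algorithm; the architecture, points, and loss below are all invented for this sketch), one can train a tiny network to keep the correct label at a clean point and at a probe point inside the perturbation ball, while forcing the opposite label at one designated in-ball point:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (not from the paper): a clean 2-D input, a nearby
# point whose label we deliberately flip (the "hidden counterexample"),
# and a probe point standing in for adversarial-training samples.
x_clean = np.array([1.0, 1.0])
eps = 0.2                                    # L-inf perturbation radius
x_hidden = x_clean + np.array([0.1, -0.1])   # inside the eps-ball, label flipped
x_probe = x_clean + np.array([-0.15, 0.15])  # inside the eps-ball, label kept

X = np.stack([x_clean, x_probe, x_hidden])
y = np.array([1.0, 1.0, -1.0])               # -1 on x_hidden injects the counterexample

# One-hidden-layer tanh network, trained by plain per-sample gradient descent.
W1 = rng.normal(scale=0.5, size=(32, 2))
b1 = np.zeros(32)
w2 = rng.normal(scale=0.5, size=32)
b2 = 0.0

def forward(x):
    h = np.tanh(W1 @ x + b1)
    return w2 @ h + b2, h

lr = 0.1
for _ in range(5000):
    for xi, yi in zip(X, y):
        f, h = forward(xi)
        # Logistic loss log(1 + exp(-y f)); its gradient w.r.t. f is -y / (1 + exp(y f)).
        z = np.clip(yi * f, -50.0, 50.0)     # clip to avoid overflow in exp
        g = -yi / (1.0 + np.exp(z))
        dpre = g * w2 * (1.0 - h ** 2)       # backprop through tanh
        W1 -= lr * np.outer(dpre, xi)
        b1 -= lr * dpre
        w2 -= lr * g * h
        b2 -= lr * g

# Inspect the trained margins at the three points.
for name, pt in [("clean", x_clean), ("probe", x_probe), ("hidden", x_hidden)]:
    print(name, float(forward(pt)[0]))
```

In the real benchmark the planted counterexample must additionally survive strong adversarial attacks such as PGD, which the paper addresses with adversarial training and gradient-masking mitigation; this toy makes no such guarantee.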
Problem

Research questions and friction points this paper is trying to address.

Testing soundness of neural network verifiers
Identifying false verification claims in benchmarks
Creating hidden counterexamples for verifier evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmark with hidden counterexamples for verifier testing
Training method produces deliberately hidden counterexamples
Systematic construction across diverse model architectures
👥 Authors

Xingjian Zhou
University of California, Los Angeles

Hongji Xu
Duke University

Andy Xu
University of California, Los Angeles

Zhouxing Shi
Assistant Professor, University of California, Riverside
Machine Learning, Trustworthy AI

Cho-Jui Hsieh
University of California, Los Angeles
Machine Learning, Optimization

Huan Zhang
University of Illinois Urbana-Champaign