The 6th International Verification of Neural Networks Competition (VNN-COMP 2025): Summary and Results

📅 2025-12-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Neural network verification tools lack fair, reproducible evaluation standards. Method: This report summarizes the sixth iteration of the International Verification of Neural Networks Competition (VNN-COMP), a rigorous benchmarking framework featuring standardized ONNX model representation, an extended VNN-LIB specification supporting multi-class properties, preset parameter locking, and an automated AWS-based evaluation pipeline. Eight participating teams are systematically evaluated across 16 regular and 9 extended benchmarks. Contribution/Results: The competition employs a fully transparent, pre-specified parameter strategy and a parallel multi-tool verification framework, significantly enhancing result reproducibility and cross-tool comparability. It produces an authoritative performance leaderboard, explicitly characterizing the capabilities and applicability domains of three major tool categories. Furthermore, it maintains a community-endorsed benchmark suite, thereby advancing the standardization of verification tool interfaces and fostering collaborative tool evolution.

📝 Abstract
This report summarizes the 6th International Verification of Neural Networks Competition (VNN-COMP 2025), held as part of the 8th International Symposium on AI Verification (SAIV), which was co-located with the 37th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specifications (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2025 iteration, 8 teams participated, competing on a diverse set of 16 regular and 9 extended benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of the competition.
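To make the standardized formats concrete, here is a minimal sketch of a VNN-LIB property file, written out from Python. The two-input/two-output shape and the numeric bounds are illustrative assumptions, not taken from any competition benchmark; VNN-LIB itself uses an SMT-LIB-style syntax in which X_i denote network inputs and Y_i network outputs.

```python
# Minimal sketch of a VNN-LIB property (illustrative bounds, hypothetical
# network with two inputs and two outputs). VNN-LIB uses SMT-LIB-style
# declarations and assertions; a verifier searches for an input in the
# asserted box that also satisfies the asserted output condition.
PROPERTY = """\
(declare-const X_0 Real)
(declare-const X_1 Real)
(declare-const Y_0 Real)
(declare-const Y_1 Real)

; Input box: a small perturbation region around a nominal point.
(assert (>= X_0 0.40))
(assert (<= X_0 0.60))
(assert (>= X_1 0.40))
(assert (<= X_1 0.60))

; Violation condition: the runner-up output overtakes the expected one.
(assert (>= Y_1 Y_0))
"""

with open("prop_example.vnnlib", "w") as f:
    f.write(PROPERTY)
```

A verifier paired with the corresponding ONNX network then either finds a concrete counterexample satisfying all assertions or proves that none exists over the whole input box.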
Problem

Research questions and friction points this paper is trying to address.

Compare neural network verification tools objectively
Standardize tool interfaces and formats for verification
Evaluate tools on diverse benchmarks using equal hardware
Innovation

Methods, ideas, or system contributions that make the work stand out.

Standardized ONNX and VNN-LIB formats
Equal-cost AWS hardware evaluation pipeline (see the sketch after this list)
Pre-public parameter selection for tools
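As a rough illustration of how such an evaluation loop works, the sketch below runs one tool on one (ONNX model, VNN-LIB property) instance under a hard timeout. The tool binary, its flags, and the verdict parsing are hypothetical placeholders; the actual competition pipeline is AWS-based and considerably more involved.

```python
import subprocess

def run_instance(tool_cmd, model_path, prop_path, timeout_s):
    """Run one verifier on one (ONNX, VNN-LIB) instance with a hard timeout.

    tool_cmd is a hypothetical command line whose parameters were fixed
    ("locked") by the team before the final test set was published.
    """
    try:
        result = subprocess.run(
            tool_cmd + [model_path, prop_path],
            capture_output=True, text=True, timeout=timeout_s,
        )
        out = result.stdout.strip()
        # Tools report a verdict such as "sat" (counterexample found),
        # "unsat" (property holds), or "unknown"; parsing the last stdout
        # line is an assumption for this sketch.
        return out.splitlines()[-1] if out else "unknown"
    except subprocess.TimeoutExpired:
        return "timeout"

# Hypothetical usage:
# verdict = run_instance(["./verifier", "--config", "locked.yaml"],
#                        "net.onnx", "prop_example.vnnlib", timeout_s=300)
```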
👥 Authors
Konstantin Kaulen
RWTH Aachen University, Aachen, Germany
Tobias Ladner
PhD student, Technical University of Munich
AI Safety, Formal Neural Network Verification, Set-Based Computing
Stanley Bak
Stony Brook University, Stony Brook, New York, USA
Christopher Brix
PhD Candidate in Computer Science, RWTH Aachen University
Verification of Neural Networks, Machine Learning, Artificial Intelligence, Neural Networks, Machine Translation
Hai Duong
George Mason University, Fairfax, Virginia, USA
Thomas Flinkow
Maynooth University, Maynooth, Ireland
Taylor T. Johnson
Vanderbilt University, Nashville, Tennessee, USA
Lukas Koller
Technical University of Munich
Neural Network Verification, Set-Based Computing
Edoardo Manino
University of Manchester
ThanhVu H Nguyen
George Mason University, Fairfax, Virginia, USA
Haoze Wu
Amherst College, Amherst, Massachusetts, USA; VMware Research by Broadcom, USA