🤖 AI Summary
Problem: The C++ ecosystem lacks standardized benchmark datasets for empirically evaluating tools that infer or verify formal specifications.
Method: FormalSpecCpp introduces the first open-source, C++-specific formal specification benchmark, comprising numerous programs annotated with rigorously defined preconditions and postconditions, fully compliant with ISO C++ syntax and contract-based programming idioms. Specifications are generated via large language models (LLMs), then manually verified and systematically annotated to ensure semantic fidelity and syntactic consistency.
Contribution/Results: This dataset fills a gap in empirical research on formal methods for C++, enabling reproducible, scalable evaluation of specification inference tools, program verification algorithms, and LLMs' capabilities in formal software development (including fine-tuning, generalization, and specification synthesis). By providing a publicly accessible benchmark, FormalSpecCpp improves methodological rigor and cross-study comparability in this domain.
📝 Abstract
FormalSpecCpp is a dataset designed to fill the gap in standardized benchmarks for verifying formal specifications in C++ programs. To the best of our knowledge, this is the first comprehensive collection of C++ programs with well-defined preconditions and postconditions. It provides a structured benchmark for evaluating specification inference tools and testing the accuracy of generated specifications. Researchers and developers can use this dataset to benchmark specification inference tools, fine-tune Large Language Models (LLMs) for automated specification generation, and analyze the role of formal specifications in improving program verification and automated testing. By making this dataset publicly available, we aim to advance research in program verification, specification inference, and AI-assisted software development. The dataset and the code are available at https://github.com/MadhuNimmo/FormalSpecCpp.
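To make the idea of a program with a precondition and postcondition concrete, the following is a minimal sketch of what such an annotated C++ function could look like. It is not taken from the dataset: the function name `maxOf` is hypothetical, and expressing the specification with `assert` from `<cassert>` is an assumption made for illustration, since the text above does not state the exact annotation format FormalSpecCpp uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical example (not from the dataset): returns the largest element.
int maxOf(const std::vector<int>& values) {
    // Precondition: the input must contain at least one element.
    assert(!values.empty());

    int result = values[0];
    for (int v : values) {
        if (v > result) {
            result = v;
        }
    }

    // Postcondition 1: the result is an element of the input.
    assert(std::find(values.begin(), values.end(), result) != values.end());
    // Postcondition 2: no element of the input exceeds the result.
    assert(std::all_of(values.begin(), values.end(),
                       [result](int v) { return v <= result; }));
    return result;
}
```

Calling `maxOf({3, 9, 2})` satisfies the precondition and returns 9. A specification inference tool evaluated against a benchmark of this kind would be expected to recover conditions equivalent to the ones asserted here.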