Do Unit Proofs Work? An Empirical Study of Compositional Bounded Model Checking for Memory Safety Verification

📅 2025-03-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the lack of systematic, consistent methods and empirical evaluation for unit proofs, which compositional bounded model checking (BMC) uses to verify memory safety. It presents the first empirical study of unit proofing and proposes a systematic, feedback-driven method for constructing unit proofs based on objective criteria. Applying the method yields 73 unit proofs across four embedded operating systems. On average, a unit proof took 87 minutes to develop and 61 minutes to execute. The proofs detected 74% of recreated defects, found a further 9% once the BMC bound was increased, and exposed 19 previously unknown defects. The study also finds that embedded software requires only small unit proofs, and its results offer practical guidance for engineers and empirical data for tooling design.

📝 Abstract

Memory safety defects pose a major threat to software reliability, enabling cyberattacks, outages, and crashes. To mitigate these risks, organizations adopt Compositional Bounded Model Checking (BMC), using unit proofs to formally verify memory safety. However, methods for creating unit proofs vary across organizations and are inconsistent within the same project, leading to errors and missed defects. In addition, unit proofing remains understudied, with no systematic development methods or empirical evaluations. This work presents the first empirical study on unit proofing for memory safety verification. We introduce a systematic method for creating unit proofs that leverages verification feedback and objective criteria. Using this approach, we develop 73 unit proofs for four embedded operating systems and evaluate their effectiveness, characteristics, cost, and generalizability. Our results show unit proofs are cost-effective, detecting 74% of recreated defects, with an additional 9% found with increased BMC bounds, and 19 new defects exposed. We also found that embedded software requires small unit proofs, which can be developed in 87 minutes and executed in 61 minutes on average. These findings provide practical guidance for engineers and empirical data to inform tooling design.
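To illustrate why increasing the BMC bound can expose additional defects, here is a minimal toy model in Python. It is not the paper's tooling or method: the buffer size, the function names (`first_out_of_bounds_write`, `unit_proof`), and the defect itself are hypothetical, and a real bounded model checker reasons symbolically rather than by enumerating inputs. The sketch also silently truncates the loop at the bound, which is a simplification.

```python
BUF_SIZE = 4  # hypothetical capacity of the modeled fixed-size buffer


def first_out_of_bounds_write(length, unwind_bound):
    """Model the loop `for i in range(length): buf[i] = ...`.

    The loop is unrolled at most `unwind_bound` times, mimicking a
    bounded exploration. The modeled code has a deliberate defect:
    `length` is never checked against BUF_SIZE.
    Returns the first offending index, or None if none is reached.
    """
    for i in range(min(length, unwind_bound)):
        if i >= BUF_SIZE:  # memory-safety property: index must be in bounds
            return i
    return None


def unit_proof(max_length, unwind_bound):
    """Check every input length in [0, max_length] under the given bound.

    Returns a list of (length, offending_index) pairs.
    """
    violations = []
    for length in range(max_length + 1):
        index = first_out_of_bounds_write(length, unwind_bound)
        if index is not None:
            violations.append((length, index))
    return violations


if __name__ == "__main__":
    print(unit_proof(max_length=8, unwind_bound=4))
    print(unit_proof(max_length=8, unwind_bound=5))
```

With a bound of 4, the loop is never unrolled far enough to reach index 4, so the check reports nothing. Raising the bound to 5 reveals the overflow for every input length above 4.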

Problem

Research questions and friction points this paper is trying to address.

Evaluates effectiveness of unit proofs in memory safety verification.

Introduces systematic method for creating consistent unit proofs.

Assesses cost, generalizability, and defect detection of unit proofs.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic method for creating unit proofs

Leverages verification feedback and objective criteria

Empirical evaluation of the effectiveness of 73 unit proofs
