🤖 AI Summary
This work addresses the lack of effective test coverage criteria for quantum programs, which makes test adequacy hard to assess. It adapts three classical software testing criteria (condition, decision, and path coverage) to circuit-based quantum programs and introduces a probabilistic variant of each that augments structural coverage with a confidence measure. The authors implement the criteria in a tool, QaCoCo, and evaluate them on 540 quantum circuits, using mutation testing to assess fault detection. On average, condition and decision coverage reach 97.56% and 97.63%, while path coverage reaches only 71.84%; multi-controlled gates induce path explosion and are the main factor limiting path coverage. The corresponding average probabilistic confidence values are 88.87%, 88.65%, and 37.18%, respectively. Mutation testing shows weak or no correlation between structural coverage and fault detection.
📝 Abstract
Coverage criteria play a central role in assessing test adequacy in classical software, yet their effectiveness for quantum programs remains largely unexplored. In this paper, we propose six quantum-tailored criteria adapted from their classical counterparts: condition, decision, and path coverage, together with a probabilistic variant of each. We present QaCoCo, a tool that computes these criteria for circuit-based quantum programs. We empirically evaluate the criteria on a large and diverse set of 540 circuits and analyze the coverage achieved. Our results show that while circuits frequently achieve high condition and decision coverage (97.56% and 97.63%, on average), path coverage remains limited (71.84%), particularly in the presence of multi-controlled gates, which induce extreme path explosion and coverage imbalance. Moreover, to account for the probabilistic nature of quantum circuits, we introduce probabilistic coverage, which augments structural coverage with a confidence measure (88.87%, 88.65%, and 37.18% for condition, decision, and path coverage, respectively, on average). Finally, through mutation testing, we find weak or no correlation between fault detection and structural coverage, consistent with observations in classical computing.
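
To make the gap between condition/decision coverage and path coverage concrete, here is a minimal Python sketch for a single multi-controlled gate. It is not QaCoCo's implementation, and the abstract does not give the paper's formal definitions. The sketch assumes, by analogy with the classical criteria, that each control qubit is a condition (to be seen as both 0 and 1), that the gate firing or not firing is the decision, and that each combination of control values is a path. The function name `control_coverage` and its input format are hypothetical.

```python
def control_coverage(num_controls, observed):
    """Illustrative coverage ratios for one multi-controlled gate.

    ASSUMPTION: these definitions mirror classical condition, decision,
    and path coverage; they are not taken from QaCoCo.

    observed: iterable of tuples holding the control-qubit values (0 or 1)
              seen across all test executions.
    """
    observed = set(observed)

    # Condition coverage: every control qubit seen as 0 and as 1.
    condition_hits = {(i, bits[i]) for bits in observed for i in range(num_controls)}
    condition = len(condition_hits) / (2 * num_controls)

    # Decision coverage: the gate fired (all controls are 1) and did not fire.
    decision_hits = {all(bits) for bits in observed}
    decision = len(decision_hits) / 2

    # Path coverage: every combination of control values, 2**num_controls in total.
    path = len(observed) / 2 ** num_controls

    return condition, decision, path


# Two test executions on a gate with three controls.
print(control_coverage(3, [(0, 0, 0), (1, 1, 1)]))  # (1.0, 1.0, 0.25)
```

Under these assumed definitions, two executions fully satisfy condition and decision coverage, yet they exercise only 2 of the 8 control combinations. The number of combinations doubles with every added control qubit, which is one way to read the path explosion the abstract attributes to multi-controlled gates.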