Identifying Flaky Tests in Quantum Code: A Machine Learning Approach

📅 2025-02-06

🤖 AI Summary

Problem: Quantum programs are prone to flaky tests, meaning tests whose outcomes are not reproducible across runs, because of quantum-specific features such as superposition and entanglement. Existing research lacks both a systematic analysis of the causes and effective detection methods. Method: The paper investigates the root causes of quantum flakiness and introduces a machine learning platform for identifying flaky tests in quantum software, which the authors describe as the first of its kind. A supervised learning framework combines static and dynamic program features, and five models (extreme gradient boosting (XGBoost), decision tree, random forest, k-nearest neighbors, and support vector machine) are compared under balanced and imbalanced data settings. Contribution/Results: XGBoost achieves the highest F1 score and Matthews Correlation Coefficient (MCC) on the balanced dataset, while the decision tree model achieves the highest on the imbalanced dataset. The authors also extend and publicly release an annotated dataset of quantum flaky tests, providing tooling and benchmark resources for quantum software reliability research.
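To make the link between quantum indeterminacy and flakiness concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the functions `measure_superposition`, `run_test_once`, and `failure_rate`, the shot count, and the tolerances are all hypothetical. It simulates measuring a qubit in an equal superposition and shows how a test that asserts on the observed statistics can fail on some runs purely by chance.

```python
import random


def measure_superposition(shots, rng):
    """Simulate measuring a qubit in an equal superposition.

    Each shot yields 1 with probability 1/2. Returns the number of 1s.
    (A classical stand-in for a real quantum backend; an assumption.)
    """
    return sum(rng.random() < 0.5 for _ in range(shots))


def run_test_once(shots, tolerance, rng):
    """Hypothetical test: pass if the fraction of 1s is within
    `tolerance` of the ideal value 0.5."""
    ones = measure_superposition(shots, rng)
    return abs(ones - shots / 2) <= tolerance * shots


def failure_rate(shots, tolerance, reruns, seed=0):
    """Rerun the same test many times and report how often it fails."""
    rng = random.Random(seed)
    failures = sum(
        not run_test_once(shots, tolerance, rng) for _ in range(reruns)
    )
    return failures / reruns


if __name__ == "__main__":
    # A tighter tolerance makes the unchanged test fail more often.
    for tol in (0.02, 0.05, 0.10):
        rate = failure_rate(shots=200, tolerance=tol, reruns=1000)
        print(f"tolerance={tol:.2f}  failure rate={rate:.3f}")
```

The code under test never changes between reruns, yet the pass/fail outcome does, which is the behavior that characterizes a flaky test. Rerunning in this way is only an illustration; the paper's approach is to predict flakiness with supervised models.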

📝 Abstract

Testing and debugging quantum software pose significant challenges due to the inherent complexities of quantum mechanics, such as superposition and entanglement. One challenge is indeterminacy, a fundamental characteristic of quantum systems, which increases the likelihood of flaky tests in quantum programs. To the best of our knowledge, there is a lack of comprehensive studies on quantum flakiness in the existing literature. In this paper, we present a novel machine learning platform that leverages multiple machine learning models to automatically detect flaky tests in quantum programs. Our evaluation shows that the extreme gradient boosting and decision tree-based models outperform the other models (random forest, k-nearest neighbors, and support vector machine): extreme gradient boosting achieves the highest F1 score and Matthews Correlation Coefficient on a balanced dataset, and the decision tree-based model does so on an imbalanced dataset. Furthermore, we expand the currently limited dataset for researchers interested in quantum flaky tests. In the future, we plan to explore the development of unsupervised learning techniques to detect and classify quantum flaky tests more effectively. These advancements aim to improve the reliability and robustness of quantum software testing.
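The two metrics named in the abstract can be computed directly from a confusion matrix. The sketch below uses the standard definitions of F1 and the Matthews Correlation Coefficient. The confusion counts are invented for illustration and are not results from the paper. They describe a hypothetical classifier with the same per-class accuracy on a balanced and an imbalanced test set, which shows why scores from the two settings are reported separately.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts (assumptions, not data from the paper).
# Both cases: 90% of flaky and 90% of non-flaky tests classified correctly.
balanced = dict(tp=45, fn=5, tn=45, fp=5)    # 50 flaky, 50 non-flaky
imbalanced = dict(tp=9, fn=1, tn=81, fp=9)   # 10 flaky, 90 non-flaky

if __name__ == "__main__":
    for name, c in (("balanced", balanced), ("imbalanced", imbalanced)):
        f1 = f1_score(c["tp"], c["fp"], c["fn"])
        m = mcc(c["tp"], c["tn"], c["fp"], c["fn"])
        print(f"{name:>10}: F1={f1:.3f}  MCC={m:.3f}")
```

With these made-up counts, both scores drop on the imbalanced set even though the per-class accuracy is unchanged, so a model ranking obtained on balanced data need not carry over to imbalanced data.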

Problem

Research questions and friction points this paper is trying to address.

Detect flaky tests in quantum code

Use machine learning for quantum software

Improve quantum software testing reliability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine learning detects quantum flaky tests

Extreme gradient boosting leads on balanced data; decision trees lead on imbalanced data

Expands dataset for quantum flaky test research
