Targeted Test Selection Approach in Continuous Integration

2025-09-12
AI Summary
To address inefficient test selection in continuous integration (CI) caused by growing code and test suite sizes, this paper proposes a coverage-agnostic, change-driven test selection method. Our approach models changed files as bag-of-words representations and integrates cross-file structural features with defect-proneness indicators to train a lightweight, scalable machine learning model. Crucially, it eliminates reliance on historical code coverage data, enabling end-to-end, commit-based test recommendation. Evaluated on a real-world industrial dataset, the method executes only 15% of the test suite, reduces test execution time by 5.9×, and accelerates the overall CI pipeline by 5.6×, while maintaining a 95.2% failure-detection rate. This significantly improves developer feedback latency and defect interception capability.

๐Ÿ“ Abstract
In modern software development change-based testing plays a crucial role. However, as codebases expand and test suites grow, efficiently managing the testing process becomes increasingly challenging, especially given the high frequency of daily code commits. We propose Targeted Test Selection (T-TS), a machine learning approach for industrial test selection. Our key innovation is a data representation that represent commits as Bags-of-Words of changed files, incorporates cross-file and additional predictive features, and notably avoids the use of coverage maps. Deployed in production, T-TS was comprehensively evaluated against industry standards and recent methods using both internal and public datasets, measuring time efficiency and fault detection. On live industrial data, T-TS selects only 15% of tests, reduces execution time by $5.9 imes$, accelerates the pipeline by $5.6 imes$, and detects over 95% of test failures. The implementation is publicly available to support further research and practical adoption.
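To make the Bags-of-Words idea concrete, the following is a minimal sketch of how a commit could be encoded from its changed file paths. The tokenization rule (splitting paths on separators and punctuation), the function names, and the example paths are assumptions made for illustration; the abstract does not specify how T-TS tokenizes files.

```python
import re
from collections import Counter


def tokenize_path(path: str) -> list[str]:
    """Split a file path into lowercase tokens (assumed tokenization rule)."""
    return [tok for tok in re.split(r"[/\\._\-]+", path.lower()) if tok]


def commit_bag_of_words(changed_files: list[str]) -> Counter:
    """Count tokens over all files changed in one commit."""
    bag = Counter()
    for path in changed_files:
        bag.update(tokenize_path(path))
    return bag


def vectorize(bag: Counter, vocabulary: list[str]) -> list[int]:
    """Project a bag onto a fixed vocabulary so it can feed a model."""
    return [bag.get(token, 0) for token in vocabulary]


# Hypothetical commit touching two files.
changed = ["src/payments/card_validator.py", "src/payments/api.py"]
bag = commit_bag_of_words(changed)
features = vectorize(bag, ["src", "payments", "api", "tests"])
```

A fixed-length count vector like `features` needs no coverage map, which is the property the abstract emphasizes. Cross-file and other predictive features would be appended to it in a fuller implementation.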
Problem

Research questions and friction points this paper is trying to address.

Selects minimal tests for code changes efficiently
Reduces test execution time in continuous integration
Maintains high fault detection without coverage maps
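The two quantities traded off here, how much of the suite is run and how many failures are still caught, can be computed as in the sketch below. The function name, the metric definitions, and the example test names are assumptions for illustration and may differ from the exact definitions used in the paper.

```python
def selection_metrics(
    selected: set[str], all_tests: set[str], failed: set[str]
) -> dict[str, float]:
    """Return the fraction of tests selected and the fraction of failures caught."""
    if not all_tests:
        raise ValueError("all_tests must not be empty")
    selected_fraction = len(selected & all_tests) / len(all_tests)
    # With no failing tests there is nothing to miss, so treat detection as 1.0.
    detection_rate = len(selected & failed) / len(failed) if failed else 1.0
    return {
        "selected_fraction": selected_fraction,
        "failure_detection_rate": detection_rate,
    }


# Hypothetical run: 20 tests in the suite, 3 selected, 2 actually failing.
all_tests = {f"t{i}" for i in range(1, 21)}
metrics = selection_metrics({"t1", "t2", "t3"}, all_tests, {"t2", "t9"})
```

In this made-up run, 15% of the suite is selected and one of the two failing tests is caught, so the failure detection rate is 0.5.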
Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine learning approach for test selection
Bags-of-Words representation for commits
Avoids coverage maps with predictive features
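Once a model has scored each test for a commit, selection reduces to choosing which tests to run. The sketch below assumes the model outputs a per-test failure score in [0, 1] and that tests are selected by a fixed threshold. Both the score format and the threshold rule are illustrative assumptions, not details confirmed by the paper.

```python
def select_tests(failure_scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Pick tests whose predicted failure score meets the (assumed) threshold.

    Results are ordered by descending score, then by name for determinism.
    """
    chosen = [name for name, score in failure_scores.items() if score >= threshold]
    return sorted(chosen, key=lambda name: (-failure_scores[name], name))


# Hypothetical model output for one commit.
scores = {"test_api": 0.9, "test_ui": 0.1, "test_card": 0.6}
to_run = select_tests(scores, threshold=0.5)
```

Lowering the threshold runs more tests and catches more failures, while raising it shortens the pipeline at the risk of missing some.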
Pavel Plyusnin
T-Technologies
Aleksey Antonov
T-Technologies
Vasilii Ermakov
T-Technologies
Aleksandr Khaybriev
T-Technologies
Margarita Kikot
T-Technologies
Ilseyar Alimova
KFU, Higher School of ITIS
Stanislav Moiseev
T-Technologies
computer science, AI, mathematics