Targeted Test Selection Approach in Continuous Integration

📅 2025-09-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address inefficient test selection in continuous integration (CI) caused by growing code and test suite sizes, this paper proposes a coverage-agnostic, change-driven test selection method. Our approach models changed files as bag-of-words representations and integrates cross-file structural features with defect-proneness indicators to train a lightweight, scalable machine learning model. Crucially, it eliminates reliance on historical code coverage data, enabling end-to-end, commit-based test recommendation. Evaluated on a real-world industrial dataset, the method executes only 15% of the test suite, reduces test execution time by 5.9×, and accelerates the overall CI pipeline by 5.6×, while maintaining a 95.2% failure-detection rate. This significantly improves developer feedback latency and defect interception capability.

📝 Abstract

In modern software development, change-based testing plays a crucial role. However, as codebases expand and test suites grow, efficiently managing the testing process becomes increasingly challenging, especially given the high frequency of daily code commits. We propose Targeted Test Selection (T-TS), a machine learning approach for industrial test selection. Our key innovation is a data representation that represents commits as Bags-of-Words of changed files, incorporates cross-file and additional predictive features, and notably avoids the use of coverage maps. Deployed in production, T-TS was comprehensively evaluated against industry standards and recent methods using both internal and public datasets, measuring time efficiency and fault detection. On live industrial data, T-TS selects only 15% of tests, reduces execution time by $5.9\times$, accelerates the pipeline by $5.6\times$, and detects over 95% of test failures. The implementation is publicly available to support further research and practical adoption.
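The abstract reports three kinds of numbers: the share of tests selected, the speedup, and the share of failures detected. The text does not say exactly how the paper computes them, so the sketch below uses common definitions as an assumption. The function names and the toy data are hypothetical and are not taken from the T-TS implementation.

```python
def selection_ratio(selected: set[str], all_tests: set[str]) -> float:
    """Fraction of the full test suite that was chosen to run."""
    return len(selected & all_tests) / len(all_tests)


def failure_detection_rate(selected: set[str], failed: set[str]) -> float:
    """Fraction of truly failing tests that the selection would have run."""
    if not failed:
        return 1.0  # assumption: nothing to miss when no test fails
    return len(selected & failed) / len(failed)


def speedup(full_time: float, selected_time: float) -> float:
    """How many times faster the reduced run is than the full run."""
    return full_time / selected_time


# Toy data, invented for illustration only.
all_tests = {f"t{i}" for i in range(20)}
selected = {"t1", "t7", "t9"}   # 3 of 20 tests
failed = {"t1", "t7"}           # tests that fail when the full suite runs

ratio = selection_ratio(selected, all_tests)
detected = failure_detection_rate(selected, failed)
gain = speedup(full_time=590.0, selected_time=100.0)
```

With this toy data, 3 of 20 tests are selected and both failing tests are among them. A selection that skipped `t7` would halve the detection rate while leaving the selection ratio unchanged, which is why both numbers are reported together.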

Problem

Research questions and friction points this paper is trying to address.

Selects minimal tests for code changes efficiently

Reduces test execution time in continuous integration

Maintains high fault detection without coverage maps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine learning approach for test selection

Bags-of-Words representation for commits

Avoids coverage maps with predictive features
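A Bag-of-Words representation of a commit can be sketched as follows. This is a minimal illustration that assumes tokens are taken from the paths of the changed files; the text does not specify the tokenization, the vocabulary, or the model T-TS actually uses, and every name below is hypothetical.

```python
import re
from collections import Counter


def tokenize_path(path: str) -> list[str]:
    """Split a file path into lowercase alphanumeric tokens (assumed scheme)."""
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", path) if tok]


def commit_bag_of_words(changed_files: list[str]) -> Counter:
    """Count token occurrences across all files changed in one commit."""
    bag: Counter = Counter()
    for path in changed_files:
        bag.update(tokenize_path(path))
    return bag


def vectorize(bag: Counter, vocabulary: list[str]) -> list[int]:
    """Map a bag onto a fixed vocabulary; unseen tokens count as zero."""
    return [bag.get(tok, 0) for tok in vocabulary]


# Hypothetical commit touching three files.
commit = ["src/auth/login.py", "src/auth/session.py", "docs/README.md"]
bag = commit_bag_of_words(commit)
vocabulary = sorted(bag)            # in practice, built from training commits
features = vectorize(bag, vocabulary)
```

A vector like `features` needs only the list of changed files, so no coverage map is required to build it. In a full pipeline it would be combined with the cross-file and other predictive features and passed to a classifier that scores each test.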

🔎 Similar Papers

Fine-Grained Assertion-Based Test Selection