Practical Pipeline-Aware Regression Test Optimization for Continuous Integration

📅 2025-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address test redundancy, high feedback latency, and the differing objectives of pre- and post-submit test selection in large-scale, multi-language monorepos, this paper proposes the first pipeline-aware, bi-objective reinforcement learning framework for regression test optimization: pre-submit testing prioritizes failing tests, while post-submit testing prioritizes tests whose outcomes transition between pass and fail, which indicates non-flaky changes. The method operates entirely on language-agnostic features, integrating pipeline semantic modeling with online log analysis to support dynamically evolving industrial test suites. Evaluated on 20 weeks of real-world CI data, it achieves significantly reduced average feedback latency, a 32% improvement in pre-submit test selection precision, and a 41% reduction in false positives, without requiring expensive features such as per-test code coverage.

📝 Abstract
Massive, multi-language, monolithic repositories form the backbone of many modern, complex software systems. To ensure consistent code quality while still allowing fast development cycles, Continuous Integration (CI) is commonly applied. However, operating CI at such scale not only leads to a single point of failure for many developers, but also requires computational resources that may reach feasibility limits and cause long feedback latencies. To address these issues, developers commonly split test executions across multiple pipelines, running small and fast tests in pre-submit stages while executing long-running and flaky tests in post-submit pipelines. Given the long runtimes of many pipelines and the substantial proportion of passing test executions (98% in our pre-submit pipelines), there is not only a need but also potential for further improvements by prioritizing and selecting tests. However, many previously proposed regression optimization techniques are unfit for an industrial context, because they (1) rely on complex and difficult-to-obtain features like per-test code coverage that are not feasible in large, multi-language environments, (2) do not automatically adapt to rapidly changing systems where new tests are continuously added or modified, and (3) are not designed to distinguish the different objectives of pre- and post-submit pipelines: while pre-submit testing should prioritize failing tests, post-submit pipelines should prioritize tests that indicate non-flaky changes by transitioning from pass to fail outcomes or vice versa. To overcome these issues, we developed a lightweight and pipeline-aware regression test optimization approach that employs Reinforcement Learning models trained on language-agnostic features. We evaluated our approach on a large industry dataset collected over a span of 20 weeks of CI test executions. When predicting...
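The two pipeline objectives described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's Reinforcement Learning model: it only shows how the notion of a "relevant" test execution differs between pre-submit (failures) and post-submit (pass/fail transitions), and it ranks tests by a simple historical rate. The function names `relevance_labels` and `prioritize`, the pipeline labels, and the scoring rule are all assumptions made for illustration. The sketch also does not separate flaky from non-flaky transitions, which the abstract identifies as the actual post-submit goal.

```python
from typing import Dict, List


def relevance_labels(outcomes: List[bool], pipeline: str) -> List[int]:
    """Label the runs of one test as relevant (1) or not (0) for a pipeline.

    outcomes: chronological results, True = pass, False = fail.
    pipeline: "pre-submit" or "post-submit" (hypothetical labels).
    """
    if pipeline == "pre-submit":
        # Pre-submit objective: failing executions are the relevant ones.
        return [0 if passed else 1 for passed in outcomes]
    if pipeline == "post-submit":
        # Post-submit objective: an outcome flip (pass->fail or fail->pass)
        # between consecutive runs is the relevant event.
        return [int(prev != cur) for prev, cur in zip(outcomes, outcomes[1:])]
    raise ValueError(f"unknown pipeline: {pipeline}")


def prioritize(histories: Dict[str, List[bool]], pipeline: str) -> List[str]:
    """Order test IDs by their historical rate of relevant events.

    A stand-in heuristic (assumption), not a learned policy. Ties are broken
    by test ID so the ordering is deterministic.
    """
    def score(test_id: str) -> float:
        labels = relevance_labels(histories[test_id], pipeline)
        return sum(labels) / len(labels) if labels else 0.0

    return sorted(histories, key=lambda test_id: (-score(test_id), test_id))
```

With the same execution history, the two pipelines can produce different orderings: a test that alternates between pass and fail ranks highest for post-submit, while for pre-submit only its failure rate matters.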
Problem

Research questions and friction points this paper is trying to address.

Continuous Integration
Feedback Lag
Test Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Continuous Integration
Adaptive Regression Testing