How Developers Adopt, Use, and Evolve CI/CD Caching: An Empirical Study on GitHub Actions

📅 2026-04-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the lack of systematic understanding of how CI/CD caching is configured and maintained: caching improves build efficiency, but keeping it working imposes an ongoing burden on developers. Through an empirical analysis of 952 GitHub repositories (266 cache adopters and 686 non-adopters), covering 1,556 workflow files, 10,373 commits, and 17,185 workflow configuration changes, the authors employ code mining, configuration analysis, commit tracing, and statistical modeling to uncover real-world caching practices, evolutionary patterns, and the division of maintenance work between humans and bots. The findings show that cache-adopting repositories are more active and popular than non-adopters, that caching relies on a variety of mechanisms that are adjusted frequently, and that caching configurations in build and test jobs evolve faster than those in other job types. Parameter updates are mainly made by humans to fix issues, whereas version updates occur later and are often made by bots for dependency maintenance. The work quantifies the maintenance effort that caching requires and provides an empirical basis for improving developer tooling.

📝 Abstract
Continuous Integration/Continuous Delivery (CI/CD) caching is widely used to reduce repeated computation and improve CI/CD efficiency, yet keeping caches effective requires ongoing maintenance effort. In this paper, we present the first empirical study on how developers configure and evolve caching in CI/CD workflows on GitHub Actions. We analyze 952 GitHub repositories (266 cache adopters and 686 non-adopters) to compare repository characteristics, characterize caching usage at the job and step levels, uncover patterns in caching configuration evolution, and identify the drivers of cache-related changes. Our analysis spans 1,556 workflow files, 10,373 commits, and 17,185 workflow configuration changes, including an average of 9.37 cache-related changes per repository. Our main observations are: (1) cache-adopting repositories are more active and popular than non-adopters; (2) caching is used across multiple CI/CD job types through a variety of caching mechanisms rather than a single standardized approach; (3) caching configurations evolve through frequent, repetitive maintenance patterns, with rapid updates in build and test jobs and slower evolution in other job types; and (4) cache-related modifications are driven by distinct maintenance needs: parameter updates are mainly human-driven to fix issues, while version updates occur later and are often bot-driven for dependency maintenance. Our findings quantify the substantial maintenance effort involved in CI/CD caching and highlight opportunities to improve reliability and tool support.
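
To make the distinction between parameter updates and version updates concrete, the following is a minimal sketch of how a single cache-related step change might be labeled. It is an illustration only, not the authors' classification method. It assumes each workflow step has already been parsed into a dictionary with `uses` and `with` fields, and it treats only `actions/cache` and its sub-actions as cache-related, although the study reports a variety of caching mechanisms. The helper names (`split_uses`, `is_cache_step`, `classify_cache_change`) and the label strings are hypothetical.

```python
from typing import Optional


def split_uses(uses: str) -> tuple[str, Optional[str]]:
    """Split 'owner/action@ref' into (action, ref); ref is None if absent."""
    action, sep, ref = uses.partition("@")
    return action, (ref if sep else None)


def is_cache_step(step: dict) -> bool:
    """Assumption: a step is cache-related only if it uses actions/cache
    or one of its sub-actions (for example actions/cache/restore)."""
    action, _ = split_uses(step.get("uses", ""))
    return action == "actions/cache" or action.startswith("actions/cache/")


def classify_cache_change(old: dict, new: dict) -> list[str]:
    """Label the difference between two versions of the same cache step.

    Returns a list that may contain the hypothetical labels
    'version_update' (same action, different ref) and
    'parameter_update' (the 'with' inputs differ).
    """
    labels: list[str] = []
    if not (is_cache_step(old) and is_cache_step(new)):
        return labels

    old_action, old_ref = split_uses(old["uses"])
    new_action, new_ref = split_uses(new["uses"])

    # Same action pinned to a different tag, branch, or commit.
    if old_action == new_action and old_ref != new_ref:
        labels.append("version_update")

    # Any change to inputs such as 'path' or 'key'.
    if old.get("with", {}) != new.get("with", {}):
        labels.append("parameter_update")

    return labels


before = {
    "uses": "actions/cache@v3",
    "with": {"path": "~/.npm", "key": "npm-${{ hashFiles('package-lock.json') }}"},
}
bumped = {**before, "uses": "actions/cache@v4"}
rekeyed = {**before, "with": {**before["with"], "key": "npm-v2"}}

print(classify_cache_change(before, bumped))
print(classify_cache_change(before, rekeyed))
```

In this sketch, a change that only moves `actions/cache@v3` to `@v4` is labeled a version update, the kind of change the abstract describes as often bot-driven, while a change to `key` or `path` is labeled a parameter update, the kind described as mainly human-driven.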

Problem

Research questions and friction points this paper is trying to address.

CI/CD caching
GitHub Actions
cache maintenance
workflow evolution
empirical study

Innovation

Methods, ideas, or system contributions that make the work stand out.

CI/CD caching
empirical study
GitHub Actions
workflow evolution
maintenance effort