🤖 AI Summary
This study addresses locator fragility in automated GUI testing: structural changes in the application under test frequently break locators and fail tests even though the underlying functionality is unchanged, yet large-scale empirical evaluation has been hindered by the absence of publicly available datasets. By mining 359 open-source repositories, we identify commits that modify locators and reproduce 449 locator breaks in the four projects with the most locator changes. We present ReproBreak, the first publicly released dataset of reproducible locator breaks for Cypress and Playwright, provided together with scripts for automated reproduction. It offers a benchmark for research on locator fragility, repair techniques, and test robustness.
📝 Abstract
Automated GUI testing frameworks such as Cypress and Playwright rely on locators to find and interact with web elements. A locator break occurs when a structural change in the application under test causes a locator to no longer find its target element, so the test fails even though the underlying functionality remains unchanged. Despite the impact of locator breaks on test maintenance, no dataset exists to evaluate locator fragility in Cypress and Playwright at scale. In this paper, we present ReproBreak, a dataset of reproducible locator breaks in web application GUI tests. We analyzed 359 open-source repositories to identify commits that contain locator changes. To confirm whether these changes are indeed locator breaks, we reproduced them in the four projects with the largest number of locator changes and found 449 locator breaks, which are provided in the dataset along with scripts for automated reproduction. We believe ReproBreak serves as a valuable artifact to support research on locator fragility, repair techniques, and test robustness. The video is available at: https://youtu.be/mZByS_TnCvE. The dataset is at https://github.com/rub-sq/ReproBreak.
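To make the notion of a locator break concrete, the following is a minimal, self-contained sketch and not code from ReproBreak, Cypress, or Playwright. It models a page as a toy element tree and uses a hypothetical helper, `findByAttr`, in place of a real locator engine; the element names and ids (`submit-btn`, `save-btn`) are invented for illustration. The same button exists before and after a refactoring, but the old id-based locator no longer finds it.

```typescript
// Toy element tree standing in for a rendered page (an assumption, not a real DOM).
interface ElementNode {
  tag: string;
  attrs: Record<string, string>;
  children: ElementNode[];
}

// Hypothetical locator: depth-first search for the first element
// whose attribute `name` equals `value`.
function findByAttr(root: ElementNode, name: string, value: string): ElementNode | undefined {
  if (root.attrs[name] === value) return root;
  for (const child of root.children) {
    const hit = findByAttr(child, name, value);
    if (hit) return hit;
  }
  return undefined;
}

// Page before the change: the submit button carries id "submit-btn".
const before: ElementNode = {
  tag: "form",
  attrs: {},
  children: [{ tag: "button", attrs: { id: "submit-btn", type: "submit" }, children: [] }],
};

// Page after a structural change: the button still exists and still submits,
// but it was wrapped in a div and its id was renamed.
const after: ElementNode = {
  tag: "form",
  attrs: {},
  children: [
    {
      tag: "div",
      attrs: { class: "actions" },
      children: [{ tag: "button", attrs: { id: "save-btn", type: "submit" }, children: [] }],
    },
  ],
};

// The original locator works on the old page and breaks on the new one.
const locatorFindsBefore = findByAttr(before, "id", "submit-btn") !== undefined;
const locatorFindsAfter = findByAttr(after, "id", "submit-btn") !== undefined;

// A repaired locator targets the renamed id and finds the element again.
const repairedFindsAfter = findByAttr(after, "id", "save-btn") !== undefined;

console.log({ locatorFindsBefore, locatorFindsAfter, repairedFindsAfter });
```

In a real test suite the broken locator would surface as a failed or timed-out element lookup, and the commit that fixes the test would change only the locator string. That kind of commit is what the repository analysis described above looks for.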