🤖 AI Summary
Research on performance regression detection is hindered by the scarcity of high-quality, expert-validated real-world datasets. To address this, we introduce an open-source browser performance dataset collected from Mozilla Firefox's performance testing infrastructure between May 2023 and May 2024. It comprises 5,655 performance time series, 17,989 expert-validated performance alerts, and associated Bugzilla bug metadata. The dataset combines industrial-scale time-series data with expert-validated alerts and annotations of the resulting bugs, so detection results can be checked against both the alerts and the bugs filed for them. The dataset is publicly available on Zenodo (DOI: 10.5281/zenodo.14642238), enabling reproducible evaluation of change point detection methods and performance regression analysis.
📝 Abstract
Performance regressions in software systems can lead to significant financial losses and degraded user satisfaction, making their early detection and mitigation critical. Although practices that catch performance regressions early are important, there is a lack of publicly available datasets that comprehensively capture real-world performance measurements, expert-validated alerts, and associated metadata such as bugs and testing conditions. To address this gap, we introduce a unique dataset to support research in performance engineering, anomaly detection, and machine learning. The dataset was collected from Mozilla Firefox's performance testing infrastructure between May 2023 and May 2024 and comprises 5,655 performance time series, 17,989 performance alerts, and detailed annotations of the resulting bugs. By publishing this dataset, we provide researchers with a resource for studying performance trends, developing novel change point detection methods, and advancing performance regression analysis across diverse platforms and testing environments. The dataset is available at https://doi.org/10.5281/zenodo.14642238