A Dataset of Performance Measurements and Alerts from Mozilla (Data Artifact)

📅 2025-03-20
🤖 AI Summary
Performance regression detection is hindered by the scarcity of public, expert-validated real-world datasets. To address this, the authors introduce an open dataset collected from Mozilla Firefox's performance testing infrastructure between May 2023 and May 2024. It comprises 5,655 performance time series, 17,989 expert-validated performance alerts, and annotations of the bugs filed as a result of those alerts. The dataset combines industrial-scale time-series measurements with alert and bug metadata, which makes it suitable for evaluating change point detection and performance regression analysis methods across platforms and testing environments. The dataset is publicly available on Zenodo (DOI: 10.5281/zenodo.14642238), enabling reproducible evaluation of such methods.

📝 Abstract
Performance regressions in software systems can lead to significant financial losses and degraded user satisfaction, making their early detection and mitigation critical. Despite the importance of practices that capture performance regressions early, there is a lack of publicly available datasets that comprehensively capture real-world performance measurements, expert-validated alerts, and associated metadata such as bugs and testing conditions. To address this gap, we introduce a unique dataset to support various research studies in performance engineering, anomaly detection, and machine learning. This dataset was collected from Mozilla Firefox's performance testing infrastructure and comprises 5,655 performance time series, 17,989 performance alerts, and detailed annotations of resulting bugs collected from May 2023 to May 2024. By publishing this dataset, we provide researchers with an invaluable resource for studying performance trends, developing novel change point detection methods, and advancing performance regression analysis across diverse platforms and testing environments. The dataset is available at https://doi.org/10.5281/zenodo.14642238
Problem

Research questions and friction points this paper is trying to address.

Lack of public datasets for performance regression analysis
Need for early detection of software performance regressions
Dataset supports research in anomaly detection and machine learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dataset from Mozilla Firefox performance testing
Includes 5,655 performance time series
Contains 17,989 expert-validated performance alerts
Mohamed Bilel Besbes
REALISE Lab @ Concordia University, Montréal, Québec, Canada
Diego Elias Costa
Assistant Professor, Concordia University
Software Engineering, Software Ecosystems, Performance Engineering, SE4AI
Suhaib Mujahid
Mozilla
Software Engineering, Software Ecosystems, Mining Software Repositories, Empirical Software Engineering
Gregory Mierzwinski
Mozilla, Potton, Québec, Canada
Marco Castelluccio
Mozilla, London, United Kingdom