Towards the Datasets Used in Requirements Engineering of Mobile Apps: Preliminary Findings from a Systematic Mapping Study

📅 2025-08-31
🤖 AI Summary
This study investigates dataset usage patterns in empirical requirements engineering (RE) research on mobile applications, to identify data source bias and its implications for external validity. Following Kitchenham et al.’s systematic mapping methodology, we analyze 43 empirical studies published between 2012 and 2023. Results show that Google Play and the Apple App Store supply the datasets for more than 90% of the studies, and that the research concentrates on requirements elicitation and analysis, leaving activities such as requirements validation and evolution underexplored. Dataset use has grown over the period, yet the sources remain highly homogeneous. This work provides the first quantitative evidence of data source narrowing in mobile RE research. To mitigate this risk, we propose a “multi-source data fusion” framework advocating integration across platforms (e.g., F-Droid, GitHub), modalities (e.g., user reviews, source code, changelogs), and RE activities. The framework is intended to improve methodological rigor and practical relevance, supporting more generalizable and empirically grounded mobile RE research.

📝 Abstract
[Background] Research on requirements engineering (RE) for mobile apps employs datasets produced by app users, developers, or vendors. However, little is known about which platforms these datasets come from and which RE activities have been studied with them. [Aims] The goal of this paper is to investigate the state of the art of the mobile app datasets used in existing RE research. [Method] We carried out a systematic mapping study following the guidelines of Kitchenham et al. [Results] Based on 43 selected papers, we found that Google Play and the Apple App Store provide the datasets for more than 90% of published research in RE for mobile apps. We also found that the RE activities most often investigated with datasets are requirements elicitation and requirements analysis. [Conclusions] Our most important conclusions are: (1) the use of datasets for RE research on mobile apps has grown since 2012; (2) RE knowledge for mobile apps might be skewed by the overuse of Google Play and the Apple App Store; (3) there are attempts to supplement app reviews from repositories with other data sources; and (4) the community needs to expand the alternative sources and experiment with the complementary use of multiple sources if it wants more generalizable results. In addition, research is expected to expand to RE activities beyond elicitation and analysis.
Problem

Research questions and friction points this paper is trying to address.

Identifying dataset sources used in mobile app requirements engineering research
Investigating RE activities studied through app store datasets like Google Play
Assessing potential bias in mobile RE knowledge from dominant data sources
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic mapping study methodology
Analyzing Google Play and Apple App Store datasets
Investigating requirements elicitation and analysis