The Challenge of Generating and Evolving Real‐Life Like Synthetic Test Data Without Accessing Real‐World Raw Data—A Systematic Review

📅 2025-11-08
🏛️ Expert Syst. J. Knowl. Eng.
🤖 AI Summary
This study addresses a critical gap in generating and evolving high-fidelity, privacy-preserving synthetic test data for sensitive domains such as e-government, where effective methods that do not require access to real original data remain scarce. It presents the first systematic literature review of synthetic test data generation and evolution techniques that operate without reliance on real data, following the Kitchenham protocol. From an initial pool of 1,013 publications retrieved from IEEE Xplore, ACM Digital Library, and Scopus, 75 studies were selected, revealing 37 approaches that partially meet the requirements, nine of which emerge as the most promising candidates. The analysis further uncovers a pervasive limitation across existing methods: insufficient support for the continuous evolution of synthetic data, which the review identifies as a significant open research gap.

📝 Abstract
High‐level system testing of applications that use data from e‐Government services as input requires test data that is real‐life‐like while guaranteeing the privacy of personal information. Domains with such strong requirements include information exchange between countries, medicine, and banking, among others. This review aims to synthesise the current state‐of‐the‐practice in this domain.
Problem

Research questions and friction points this paper is trying to address.

synthetic test data
data privacy
test data evolution
real-life-like data
systematic review
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic test data
data privacy
systematic literature review
test data evolution
privacy-preserving