🤖 AI Summary
This study addresses the scarcity of real-world system requirements specifications (SyRSs), which are often inaccessible due to confidentiality and intellectual property constraints, thereby hindering research progress. To overcome this limitation, the authors propose a novel paradigm for generating high-fidelity synthetic requirements without access to authentic samples. By integrating systematic prompt engineering with cross-model validation using black-box large language models such as ChatGPT, they produce 300 synthetic SyRSs spanning ten distinct industries. The generated artifacts undergo iterative refinement through LLM-driven quality assessment and expert surveys. Experimental evaluation reveals that 62% of domain experts deem the synthetic requirements realistic; however, in-depth analysis uncovers persistent logical inconsistencies and defects, underscoring that current LLMs cannot yet fully replace human review in requirements validation.
📝 Abstract
System requirement specifications (SyRSs) are central natural-language (NL) artifacts. Access to real SyRSs for research purposes is highly valuable but limited by proprietary restrictions and confidentiality concerns. Generating synthetic SyRSs (SSyRSs) can address this scarcity. Black-box large language models (LLMs) such as ChatGPT offer compelling generation capabilities, providing easy access to NL generation without requiring real data. However, LLMs suffer from hallucinations and overconfidence, which pose major challenges to their use. We designed an exploratory study to investigate whether, despite these challenges, we can generate realistic SSyRSs with ChatGPT without access to real SyRSs. Using a systematic approach that leverages prompt patterns, LLM-based quality assessments, and iterative prompt refinements, we generated 300 SSyRSs across 10 industries with ChatGPT. The results were evaluated through cross-model checks and an expert study with n=87 submitted surveys. 62% of the experts considered the SSyRSs realistic. However, in-depth examination revealed contradictory statements and deficiencies. Overall, we were able to generate realistic SSyRSs to a certain extent with ChatGPT, but LLM-based quality assessments cannot fully replace thorough expert evaluations. This paper presents the methodology and results of our study and discusses the key insights we obtained.
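The generate → assess → refine loop described in the abstract can be sketched roughly as follows. This is a minimal, hypothetical illustration: the stub functions stand in for black-box LLM calls, and the paper's actual prompts, scoring rubric, and refinement strategy are not reproduced here.

```python
# Hypothetical sketch of iterative SSyRS generation with an LLM-based
# quality assessment. All three "model" functions are stubs; in a real
# pipeline they would call black-box LLM APIs (e.g. a generator model
# and a second critic model for the cross-model check).

def generator_llm(prompt: str) -> str:
    # Stub: produce a synthetic SyRS draft from the current prompt.
    return f"SSyRS draft for: {prompt}"

def critic_llm(ssyrs: str) -> float:
    # Stub: a second model scores the draft's realism/consistency.
    # Here the score simply grows with each refinement marker.
    return 0.5 + 0.1 * ssyrs.count("refined")

def refine(prompt: str, score: float) -> str:
    # Stub: tighten the prompt based on the critic's feedback.
    return prompt + " refined"

def generate_ssyrs(seed_prompt: str, threshold: float = 0.8,
                   max_iters: int = 5) -> str:
    """Loop: generate a draft, assess it, refine the prompt until the
    critic's score reaches the threshold or iterations run out."""
    prompt = seed_prompt
    draft = generator_llm(prompt)
    for _ in range(max_iters):
        score = critic_llm(draft)
        if score >= threshold:
            break
        prompt = refine(prompt, score)
        draft = generator_llm(prompt)
    return draft
```

As the abstract's expert study suggests, a loop like this can only raise quality up to a point: the critic model shares the generator's blind spots, so drafts that pass the automated check may still contain contradictions that only human reviewers catch.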