Automated Concern Extraction from Textual Requirements of Cyber-Physical Systems: A Multi-solution Study

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing automated concern extraction methods for Cyber-Physical Systems (CPS) requirements lack a fair, comprehensive, and domain-diverse benchmark. Method: We introduce ReqEBench, the first dedicated benchmark for requirement-level concern extraction, comprising 2,721 real-world requirements from 12 CPS systems across aerospace, healthcare, and other domains, built with a rigorous annotation protocol, fine-grained concern categorization, and multi-dimensional alignment with realistic scenarios. Contribution/Results: Using ReqEBench, we systematically evaluate rule-based approaches, traditional machine learning, and large language models (e.g., GPT-4), and find that the best-performing model achieves only a 0.24 F1-score on entity-level concern extraction, exposing a critical performance gap. Root-cause analysis identifies key failure modes of popular LLM-based solutions. ReqEBench is publicly released to support reproducible, scalable evaluation in automated requirements engineering research.

📝 Abstract
Cyber-physical systems (CPSs) are characterized by a deep integration of the information space and the physical world, which makes the extraction of requirements concerns more challenging. Some automated solutions for requirements concern extraction have been proposed to alleviate the burden on requirements engineers. However, evaluating the effectiveness of these solutions relies on fair and comprehensive benchmarks, which remain an open question. To address this gap, we propose ReqEBench, a new benchmark for CPS requirements concern extraction, which contains 2,721 requirements from 12 real-world CPSs. ReqEBench offers four advantages. It aligns with real-world CPS requirements in multiple dimensions, e.g., scale and complexity. It covers the comprehensive set of concerns related to CPS requirements. It undergoes a rigorous annotation process. It spans multiple CPS application domains, e.g., aerospace and healthcare. Using ReqEBench, we conducted a comparative study of three types of automated requirements concern extraction solutions and revealed their performance on real-world CPSs. We found that the highest F1 score of GPT-4 is only 0.24 in entity concern extraction. We further analyze failure cases of popular LLM-based solutions, summarize their shortcomings, and provide ideas for improving their capabilities. We believe ReqEBench will facilitate the evaluation and development of automated requirements concern extraction.
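For context, entity-level F1 (the metric behind the 0.24 figure reported above) is typically computed as a set overlap between the entities an extractor predicts and the human-annotated gold entities. The sketch below is illustrative only, not the paper's code, and the concern entities in it are hypothetical examples:

```python
def entity_f1(predicted, gold):
    """Precision/recall/F1 over sets of extracted entities (exact match)."""
    pred, ref = set(predicted), set(gold)
    tp = len(pred & ref)                          # entities found in both
    precision = tp / len(pred) if pred else 0.0   # correct among predicted
    recall = tp / len(ref) if ref else 0.0        # found among annotated
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: the extractor recovers 2 of 4 annotated concern
# entities and adds 2 spurious ones -> precision 0.5, recall 0.5, F1 0.5.
gold = {"sensor latency", "actuator fault", "data encryption", "timing constraint"}
pred = {"sensor latency", "actuator fault", "user interface", "power supply"}
print(round(entity_f1(pred, gold), 2))  # 0.5
```

Under this exact-match scoring, an F1 of 0.24 means roughly three quarters of predicted and annotated entities fail to line up, which is why the paper frames it as a critical gap.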
Problem

Research questions and friction points this paper is trying to address.

Automated extraction of concerns from cyber-physical system requirements
Evaluating effectiveness of existing concern extraction solutions
Benchmark development for comprehensive CPS requirements analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposed ReqEBench benchmark for CPS requirements
Evaluated three automated concern extraction solutions
Analyzed LLM failures to suggest improvements
Dongming Jin, Peking University (Requirements Engineering, Large Language Models)
Zhi Jin, Sun Yat-Sen University, Associate Professor
Xiaohong Chen, East China Normal University, China
Zheng Fang, Peking University, China
Linyu Li, Peking University (knowledge graph, AI4Science)
Shengxin Zhao, Inner Mongolia Normal University, China
Chunhui Wang, Inner Mongolia Normal University, China
Hongbin Xiao, Guangxi Normal University, China