🤖 AI Summary
This paper addresses three key challenges in sentence-pair conflict detection within software requirements documents: severe class imbalance, insufficient semantic representation capability of single-encoder models, and poor cross-domain transfer performance. To tackle these, we propose a dual-encoder collaborative transfer learning framework. Methodologically, we integrate SBERT and SimCSE into a dual-encoder architecture and enhance sentence-pair interaction modeling via a six-element concatenation strategy. We design a hybrid loss function incorporating an improved Focal Loss, a confidence penalty term, and domain-specific constraints. Additionally, we introduce a sequential cross-domain collaborative transfer mechanism to improve generalization. Experimental results demonstrate substantial gains: on in-domain evaluation, macro-F1 and weighted-F1 both increase by 10.4%; under cross-domain settings, macro-F1 improves by 11.4%, significantly outperforming state-of-the-art approaches.
📝 Abstract
A software requirements document (RD) typically contains tens of thousands of individual requirements, and ensuring consistency among these requirements is critical to the success of software engineering projects. Automated detection methods can significantly improve efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for each requirement pair, which are then fused via a six-element concatenation strategy. Second, the classifier, a two-layer fully connected feedforward neural network (FFNN), is trained with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework synergistically integrates sequential and cross-domain transfer learning. Experimental results demonstrate that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% improvement in macro-F1 in cross-domain scenarios.
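To make the dual-encoder fusion concrete, here is a minimal sketch of a six-element concatenation for a requirement pair. The abstract does not specify which six elements are used, so this is an assumption for illustration only: the SBERT and SimCSE embeddings of both sentences, plus SBERT-style difference and product interaction features. The `sbert_encode` and `simcse_encode` functions below are hypothetical stand-ins for the real pretrained encoders.

```python
import numpy as np

D = 8  # toy embedding dimension for illustration; real encoders use e.g. 768

def sbert_encode(sentence: str) -> np.ndarray:
    """Hypothetical stand-in for an SBERT encoder returning a pooled embedding."""
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    return rng.standard_normal(D)

def simcse_encode(sentence: str) -> np.ndarray:
    """Hypothetical stand-in for a SimCSE encoder."""
    rng = np.random.default_rng((abs(hash(sentence)) + 1) % (2**32))
    return rng.standard_normal(D)

def pair_features(s1: str, s2: str) -> np.ndarray:
    u1, v1 = sbert_encode(s1), sbert_encode(s2)
    u2, v2 = simcse_encode(s1), simcse_encode(s2)
    # Assumed six-element concatenation: both encoders' embeddings for each
    # sentence, plus element-wise |u - v| and u * v interaction features.
    return np.concatenate([u1, v1, u2, v2, np.abs(u1 - v1), u1 * v1])

feats = pair_features("The system shall log every transaction.",
                      "The system must not store transaction records.")
print(feats.shape)  # feature vector of length 6 * D, fed to the FFNN classifier
```

The resulting 6·D vector would then be passed to the two-layer FFNN classifier described in the abstract.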
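Two ingredients of the hybrid loss can also be sketched per example: a Focal Loss term, which down-weights easy, well-classified pairs to counter class imbalance, and a confidence penalty, which adds the negative entropy of the predicted distribution to discourage over-confident outputs. The weights `gamma`, `alpha`, and `beta` are illustrative defaults, not the paper's values, and the domain-specific constraint term is omitted here.

```python
import numpy as np

def focal_loss(p: np.ndarray, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one example: (1 - p_t)^gamma scales down easy examples."""
    pt = p[y]  # predicted probability of the true class y
    return float(-alpha * (1.0 - pt) ** gamma * np.log(pt))

def confidence_penalty(p: np.ndarray, beta: float = 0.1) -> float:
    """Negative-entropy penalty: confident (low-entropy) predictions incur
    a larger penalty relative to smoother distributions."""
    return float(beta * np.sum(p * np.log(p)))

def hybrid_loss(p: np.ndarray, y: int) -> float:
    # Illustrative combination only; the paper additionally adds
    # domain-specific constraints not modeled here.
    return focal_loss(p, y) + confidence_penalty(p)

easy = np.array([0.05, 0.95])  # confidently correct prediction for class 1
hard = np.array([0.60, 0.40])  # barely wrong prediction for class 1
print(hybrid_loss(easy, 1), hybrid_loss(hard, 1))
```

As expected, the hard example dominates the loss: the focal term for the easy pair is scaled by (1 − 0.95)² and nearly vanishes, so training gradient concentrates on the misclassified, minority-class-like pairs.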