A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders

📅 2025-11-28
🤖 AI Summary
This paper addresses three challenges in sentence-pair conflict detection within software requirements documents: severe class imbalance, the limited semantic representation capability of single-encoder models, and poor cross-domain transfer performance. To tackle these, the paper proposes a dual-encoder collaborative transfer learning framework. The method integrates SBERT and SimCSE into a dual-encoder architecture and strengthens sentence-pair interaction modeling via a six-segment concatenation strategy. A hybrid loss function combines an improved Focal Loss, a confidence penalty term, and domain-specific constraints, and a sequential cross-domain collaborative transfer mechanism improves generalization. Experimental results show that macro-F1 and weighted-F1 both increase by 10.4% in in-domain evaluation, and that macro-F1 improves by 11.4% under cross-domain settings, outperforming state-of-the-art approaches.

📝 Abstract
Software requirement documents (RDs) typically contain tens of thousands of individual requirements, and ensuring consistency among these requirements is critical for the success of software engineering projects. Automated detection methods can significantly enhance efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for requirement pairs, which are then combined through a six-element concatenation strategy. Second, the classifier, a two-layer fully connected feedforward neural network (FFNN), is trained with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework combines sequential and cross-domain transfer learning. Experimental results demonstrate that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.
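The abstract does not spell out which six elements are concatenated. As a rough illustration only, the sketch below assumes a common sentence-pair layout: for each encoder, the two sentence embeddings plus their element-wise absolute difference, giving three segments per encoder and six in total. The function names and the toy vectors are hypothetical, and the real embeddings would come from SBERT and SimCSE models rather than hand-written lists.

```python
from typing import List

Vector = List[float]


def abs_diff(u: Vector, v: Vector) -> Vector:
    """Element-wise |u - v| for two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings from the same encoder must share a dimension")
    return [abs(a - b) for a, b in zip(u, v)]


def six_part_features(
    u_sbert: Vector, v_sbert: Vector, u_simcse: Vector, v_simcse: Vector
) -> Vector:
    """Concatenate six segments into one classifier input.

    ASSUMPTION: the segments are (u, v, |u - v|) per encoder. The paper's
    actual six elements may differ.
    """
    return (
        list(u_sbert) + list(v_sbert) + abs_diff(u_sbert, v_sbert)
        + list(u_simcse) + list(v_simcse) + abs_diff(u_simcse, v_simcse)
    )


# Toy stand-ins for the embeddings of one requirement pair (hypothetical values).
features = six_part_features(
    u_sbert=[0.1, 0.9], v_sbert=[0.4, 0.5],
    u_simcse=[1.0, 0.0, 0.5], v_simcse=[0.0, 1.0, 0.5],
)
print(len(features))  # 3 * 2 + 3 * 3 dimensions
```

Under this assumed layout the two encoders may have different embedding sizes, since each contributes its own three segments; the resulting vector would then be passed to the two-layer FFNN classifier.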
Problem

Research questions and friction points this paper is trying to address.

Detects conflicts in software requirement pairs automatically
Improves accuracy on imbalanced data with dual encoders
Enhances cross-domain transfer learning for consistency checks
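The results are reported in macro-F1 and weighted-F1, and the two can diverge sharply on imbalanced data. The sketch below computes both from scratch on a made-up label set; the class names `neutral` and `conflict` and the counts are hypothetical and are not taken from the paper's datasets.

```python
from collections import Counter
from typing import Sequence, Tuple


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 for one class, treating it as the positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[float, float]:
    """Return (macro-F1, weighted-F1) over the classes present in y_true."""
    support = Counter(y_true)
    scores = {label: per_class_f1(y_true, y_pred, label) for label in support}
    macro = sum(scores.values()) / len(scores)  # every class counts equally
    weighted = sum(scores[l] * support[l] for l in support) / len(y_true)
    return macro, weighted


# Hypothetical imbalanced set: a classifier that never predicts the rare class.
y_true = ["neutral"] * 8 + ["conflict"] * 2
y_pred = ["neutral"] * 10
macro_f1, weighted_f1 = macro_and_weighted_f1(y_true, y_pred)
print(f"macro-F1={macro_f1:.3f} weighted-F1={weighted_f1:.3f}")
```

Because macro-F1 averages the per-class scores without regard to class size, a model that ignores the minority class is penalized much more heavily under macro-F1 than under weighted-F1.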
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses dual SBERT and SimCSE encoders for embeddings
Employs hybrid loss with Focal Loss variant and penalties
Integrates sequential and cross-domain transfer learning
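The listing says only that the hybrid loss uses a variant of Focal Loss, without giving its form. For orientation, the sketch below implements the standard multi-class focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), for a single example. It is not the paper's variant, and it omits the confidence penalty and domain-specific constraint terms; the default `gamma` and the example logits are assumptions.

```python
import math
from typing import Optional, Sequence, List


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(
    logits: Sequence[float],
    target: int,
    gamma: float = 2.0,
    alpha: Optional[Sequence[float]] = None,
) -> float:
    """Standard focal loss for one example (NOT the paper's variant).

    gamma down-weights examples the model already classifies confidently;
    alpha is an optional per-class weight. With gamma=0 and alpha=None this
    reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical three-class logits; class 0 is the true label in both cases.
easy = focal_loss([4.0, 0.0, 0.0], target=0)  # confident and correct
hard = focal_loss([0.0, 4.0, 0.0], target=0)  # confident and wrong
print(f"easy={easy:.6f} hard={hard:.6f}")
```

The modulating factor (1 - p_t)^gamma shrinks the contribution of well-classified examples, which is why focal-style losses are often chosen for imbalanced classification tasks such as the one described here.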
Yizheng Wang
Stanford University
Climate Change, Sequential Decision Making, POMDPs, Bayesian Statistics
Tao Jiang
School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, China
Jinyan Bai
School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, China
Zhengbin Zou
School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, China
Tiancheng Xue
School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, China
Nan Zhang
School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, China
Jie Luan
School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, China