🤖 AI Summary
To address the limited performance of text-similarity-based traceability link recovery (TLR) caused by the semantic gap between natural language (NL) and programming language (PL) artifacts, this paper proposes a cross-modal association method that integrates multiple domain-specific auxiliary strategies. The authors design a synergistic framework unifying edge-type modeling, context enhancement, and prompt engineering, each tailored to augment a heterogeneous graph transformer (HGT) and a large language model (LLM), specifically Gemini 2.5 Pro. Experimental evaluation across 12 open-source projects shows that both augmented models outperform their unmodified counterparts, and that the multi-strategy HGT and Gemini 2.5 Pro achieve average F1-score improvements of 3.68% and 8.84%, respectively, over the current state-of-the-art HGNNLink. These results empirically validate the effectiveness of strategic synergy in bridging the NL-PL semantic gap for TLR.
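To make the edge-type-modeling idea concrete, the sketch below builds a small heterogeneous graph of requirement and code artifacts with typed edges and runs one HGT layer over it with PyTorch Geometric's `HGTConv`. This is a minimal sketch under assumed conventions, not the paper's implementation: the node types, the edge types (`similar_to`, `calls`, `traced_by`), the feature dimensions, and the choice of library are all illustrative.

```python
# Minimal sketch (not the paper's implementation) of edge-type modeling in a
# Heterogeneous Graph Transformer, using PyTorch Geometric's HGTConv.
# All node/edge type names and dimensions are illustrative assumptions.
import torch
from torch_geometric.data import HeteroData
from torch_geometric.nn import HGTConv

data = HeteroData()
data['requirement'].x = torch.randn(4, 64)  # 4 requirement nodes, 64-dim text embeddings
data['code'].x = torch.randn(6, 64)         # 6 code-artifact nodes

# Typed edges let the HGT learn relation-specific attention, e.g. separating
# textual-similarity links from structural call dependencies.
data['requirement', 'similar_to', 'code'].edge_index = torch.tensor([[0, 1, 2],
                                                                     [0, 2, 5]])
data['code', 'calls', 'code'].edge_index = torch.tensor([[0, 1],
                                                         [1, 3]])
data['code', 'traced_by', 'requirement'].edge_index = torch.tensor([[0, 2, 5],
                                                                    [0, 1, 2]])

# One HGT layer; metadata() supplies the node/edge-type schema so the layer
# keeps per-type projections and per-relation attention heads.
conv = HGTConv(in_channels=64, out_channels=32, metadata=data.metadata(), heads=4)
out = conv(data.x_dict, data.edge_index_dict)
print({node_type: h.shape for node_type, h in out.items()})
```

In a setup like this, a downstream classifier would score requirement-code pairs from the resulting node embeddings; the auxiliary strategies enter the model as additional edge types rather than as extra text features.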
📝 Abstract
In the field of software traceability link recovery (TLR), textual similarity has long been regarded as the core criterion. However, in tasks involving natural language and programming language (NL-PL) artifacts, relying solely on textual similarity is limited by the semantic gap between the two. To investigate this, we conducted a large-scale empirical evaluation across various types of TLR tasks, revealing the limitations of textual similarity in NL-PL scenarios. To address them, we propose an approach that incorporates multiple domain-specific auxiliary strategies, identified through the empirical analysis, into two models: the Heterogeneous Graph Transformer (HGT), via edge types, and the prompt-based Gemini 2.5 Pro, via additional input information. We then evaluated our approach on the widely studied requirements-to-code TLR task, a representative case of NL-PL TLR. Experimental results show that both the multi-strategy HGT and Gemini 2.5 Pro models outperformed their original counterparts without strategy integration. Furthermore, compared to the current state-of-the-art method HGNNLink, the multi-strategy HGT and Gemini 2.5 Pro models achieved average F1-score improvements of 3.68% and 8.84%, respectively, across twelve open-source projects, demonstrating the effectiveness of multi-strategy integration for the requirements-to-code TLR task.
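To illustrate the second integration path, the sketch below shows one way "additional input information" could be supplied to a prompt-based LLM: the prompt combines a requirement, a candidate code artifact, and auxiliary context (here, call dependencies) and asks for a binary trace-link judgment. This is a hedged illustration, not the paper's prompt: the template wording, the auxiliary-context field, and the use of Google's google-genai SDK with the `gemini-2.5-pro` model ID are assumptions.

```python
# Hedged sketch of prompt-based requirements-to-code TLR with Gemini; this is
# not the paper's prompt. The template and auxiliary-context field are
# assumptions; requires the google-genai package and a GEMINI_API_KEY
# environment variable.
from google import genai

PROMPT_TEMPLATE = """You are judging software traceability links.

Requirement:
{requirement}

Candidate code artifact:
{code}

Additional context (illustrative auxiliary strategy, e.g. call dependencies):
{context}

Answer with exactly one word, "yes" or "no": does this code implement the requirement?"""

def judge_link(requirement: str, code: str, context: str) -> bool:
    """Return True if the model judges the requirement-code pair as linked."""
    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    response = client.models.generate_content(
        model="gemini-2.5-pro",  # assumed model ID
        contents=PROMPT_TEMPLATE.format(requirement=requirement,
                                        code=code, context=context),
    )
    return response.text.strip().lower().startswith("yes")

if __name__ == "__main__":
    print(judge_link(
        requirement="The system shall lock an account after five failed logins.",
        code="class LoginGuard { void onFailure(User u) { if (++u.fails >= 5) u.lock(); } }",
        context="LoginGuard.onFailure is called by AuthService.authenticate",
    ))
```

The design point is that the auxiliary strategies reach the LLM purely through the prompt text, so no model fine-tuning is implied; only the input is enriched.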