Co-Change Graph Entropy: A New Process Metric for Defect Prediction

📅 2025-04-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To improve the accuracy of file-level software defect prediction, this paper proposes *Co-Change Graph Entropy*, a novel metric that models file co-change relationships as a weighted undirected graph and defines an information-theoretic entropy measure over the graph structure to quantify how widely co-changes are scattered. Experiments on eight Apache projects show a significant correlation between this metric and file-level defect counts (Pearson correlation coefficient of up to 0.54). The metric complements traditional change entropy: across 40 experimental settings (five machine learning classifiers on eight projects), combining the two improves AUROC in 82.5% of cases and MCC in 65% of cases, and the Friedman test followed by the post-hoc Nemenyi test confirms that these gains are statistically significant. Replacing change entropy with Co-Change Graph Entropy alone also improves AUROC in 72.5% of cases and MCC in 62.5%, but those improvements are not statistically significant. The work contributes a new process metric for defect prediction that quantifies co-change scattering.

📝 Abstract
Process metrics, valued for their language independence and ease of collection, have been shown to outperform product metrics in defect prediction. Among these, change entropy (Hassan, 2009) is widely used at the file level and has proven highly effective. Additionally, past research suggests that co-change patterns provide valuable insights into software quality. Building on these findings, we introduce Co-Change Graph Entropy, a novel metric that models co-changes as a graph to quantify co-change scattering. Experiments on eight Apache projects reveal a significant correlation between co-change entropy and defect counts at the file level, with a Pearson correlation coefficient of up to 0.54. In file-level defect classification, replacing change entropy with co-change entropy improves AUROC in 72.5% of cases and MCC in 62.5% of cases across 40 experimental settings (five machine learning classifiers and eight projects), though these improvements are not statistically significant. However, when co-change entropy is combined with change entropy, AUROC improves in 82.5% of cases and MCC in 65%, with statistically significant gains confirmed via the Friedman test followed by the post-hoc Nemenyi test. These results indicate that co-change entropy complements change entropy, significantly enhancing defect classification performance and underscoring its practical importance in defect prediction.
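The percentages in the abstract are win rates over 40 settings, not relative gains in the scores themselves. A minimal sketch of that arithmetic follows; the helper name `wins_from_rate` and the dictionary labels are illustrative and do not come from the paper.

```python
from fractions import Fraction

CLASSIFIERS = 5
PROJECTS = 8
SETTINGS = CLASSIFIERS * PROJECTS  # 40 experimental settings, as stated in the abstract


def wins_from_rate(rate_percent: str, settings: int = SETTINGS) -> int:
    """Convert a reported win rate (in percent) back to a whole number of settings."""
    wins = Fraction(rate_percent) / 100 * settings
    if wins.denominator != 1:
        raise ValueError(f"{rate_percent}% of {settings} is not a whole number")
    return int(wins)


# Win rates quoted in the abstract; the keys are illustrative labels only.
reported = {
    "replace change entropy, AUROC": "72.5",
    "replace change entropy, MCC": "62.5",
    "combine with change entropy, AUROC": "82.5",
    "combine with change entropy, MCC": "65",
}

win_counts = {name: wins_from_rate(rate) for name, rate in reported.items()}

for name, wins in win_counts.items():
    print(f"{name}: improved in {wins} of {SETTINGS} settings")
```

Each rate maps to a whole number of settings, which is consistent with a per-setting comparison of one classifier on one project.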
Problem

Research questions and friction points this paper is trying to address.

Introducing Co-Change Graph Entropy for defect prediction
Quantifying co-change scattering to improve defect classification
Combining co-change and change entropy enhances prediction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Models co-changes as a graph for defect prediction
Combines co-change entropy with change entropy
Significantly improves defect classification performance when combined with change entropy
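The ideas listed above can be illustrated with a small sketch. This page does not give the paper's formula, so the entropy definition here is an assumption: files are nodes, an edge weight counts the commits in which two files changed together, and the entropy is taken over each file's share of total edge weight. The `change_entropy` function is likewise only in the spirit of Hassan (2009), not a reproduction of it, and all function names and the sample commits are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def change_entropy(commits):
    """Shannon entropy of how changes are spread over files (in the spirit of Hassan, 2009)."""
    counts = Counter(f for files in commits for f in set(files))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def co_change_graph(commits):
    """Weighted undirected graph: edge weight = number of commits touching both files."""
    edges = Counter()
    for files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            edges[(a, b)] += 1
    return edges


def co_change_entropy(commits):
    """ASSUMED definition: entropy of each file's share of total co-change edge weight."""
    strength = Counter()
    for (a, b), weight in co_change_graph(commits).items():
        strength[a] += weight
        strength[b] += weight
    total = sum(strength.values())
    if total == 0:
        return 0.0  # no file ever changed together with another file
    return -sum((s / total) * math.log2(s / total) for s in strength.values())


# Hypothetical commit history: each inner list holds the files changed in one commit.
commits = [
    ["a.py", "b.py"],
    ["a.py", "b.py"],
    ["a.py", "c.py"],
    ["d.py"],
]

print("change entropy:   ", round(change_entropy(commits), 4))
print("co-change entropy:", round(co_change_entropy(commits), 4))
```

In this sketch a single-file commit such as `["d.py"]` contributes to change entropy but not to co-change entropy, which is one way the two metrics could carry different information and so complement each other as classifier features.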
Ethari Hrishikesh
Indian Institute of Information Technology, Manipur, Mantripukhri, Imphal, India
Amit Kumar
Indian Institute of Information Technology Allahabad, Prayagraj, India
Meher Bhardwaj
Indian Institute of Information Technology, Manipur, Mantripukhri, Imphal, India
Sonali Agarwal
Associate Professor, Indian Institute of Information Technology, Allahabad, India
Advance Data Mining, Big Data Mining, Big Data Storage and Computing Tools, Support Vector Machines, Twin Support Vector Machine