Verifier Warnings Do Not Improve Comprehensibility Prediction

📅 2026-04-24

🤖 AI Summary
This study investigates whether formal verifier warnings can serve as an effective semantic feature for improving code comprehensibility prediction models. In a control-treatment experiment, the total number of verifier warnings is added as an input feature to machine learning models, which are then compared against baseline models that rely solely on syntactic and developer-related features. This work presents the first empirical evaluation of the predictive added value of verifier warnings for human-judged code comprehensibility. The results show no statistically significant performance improvement from this feature, challenging the common assumption that verifier warnings constitute a highly discriminative semantic signal. These findings offer new empirical evidence for selecting semantic features when modeling code comprehensibility.

📝 Abstract
Proponents of software verification suggest that code simplicity is linked to the effort needed to verify code, hypothesizing that formal verifiers produce fewer false-positive warnings and require less manual intervention when analyzing simpler code. A recent meta-analysis found empirical support for this hypothesis: a small correlation between the sum of verifier warnings and human-derived code comprehensibility metrics. Based on this finding, we conjectured that the sum of warnings from verification tools (verifiers), used as an input feature representing program semantics, can improve the performance of machine learning (ML) models for code comprehensibility prediction when combined with traditional syntactic and developer features. To test this conjecture, we performed a control-treatment experiment that incorporated the verifier warning sum feature into ML models from the literature and compared their performance against models trained only on syntactic and developer features. We found no significant difference in prediction performance between models with and without the warnings feature. Our findings suggest that, although a correlation exists, the verifier warning sum offers limited discriminative power: combining syntactic and developer features alone is just as effective for predicting human-judged code comprehensibility.
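
The control-treatment comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors used, so the exact paired sign-flip permutation test below is an assumption chosen for illustration. The function name `paired_permutation_p_value` and the per-fold scores are hypothetical placeholders, not values from the paper.

```python
import itertools
from statistics import mean


def paired_permutation_p_value(control_scores, treatment_scores):
    """Two-sided exact sign-flip permutation test on paired scores.

    Assumption: each position holds the score of the control model
    (syntactic + developer features) and the treatment model (same
    features plus the verifier warning sum) on the same evaluation fold.
    """
    if len(control_scores) != len(treatment_scores):
        raise ValueError("score lists must be paired fold by fold")
    diffs = [t - c for c, t in zip(control_scores, treatment_scores)]
    observed = abs(mean(diffs))
    # Under the null hypothesis the sign of each paired difference is
    # arbitrary, so enumerate all 2**n sign assignments (fine for small n).
    extreme = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        permuted = abs(mean([s * d for s, d in zip(signs, diffs)]))
        if permuted >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical per-fold scores, made up purely to show the call.
control = [0.71, 0.68, 0.74, 0.70, 0.69]
treatment = [0.72, 0.67, 0.74, 0.71, 0.68]
print(paired_permutation_p_value(control, treatment))
```

A p-value above the chosen significance level would correspond to the "no significant difference" outcome the abstract reports. Pairing the scores by fold matters because both models are evaluated on the same data splits.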
Problem

Research questions and friction points this paper is trying to address.

code comprehensibility
verifier warnings
machine learning
program semantics
software verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

verifier warnings
code comprehensibility
machine learning
empirical evaluation
program semantics