The Effect of Code Obfuscation on Human Program Comprehension

📅 2026-03-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates how code obfuscation affects human program comprehension. Using output-prediction experiments in Python and JavaScript, it evaluates how several levels of obfuscation (identifier renaming, adversarially misleading identifiers, control-flow modifications, and combinations of these) affect prediction accuracy and response time, and how these effects interact with programming experience. Obfuscation generally increases response time and tends to reduce accuracy, yet in Python certain renaming transformations perform comparably to, or occasionally better than, the unobfuscated baseline. Obfuscation also shifts participants from rapid, heuristic reasoning toward slower, more deliberate reasoning. Accuracy is highest within a moderate range of response times, while very long response times often indicate confusion. Programming experience predicts performance mainly within a given language, with limited transfer across languages. The relationship between obfuscation strength and comprehension difficulty is non-monotonic and language-specific: JavaScript becomes steadily harder as obfuscation strengthens, whereas Python shows a more complex trend.

📝 Abstract

We investigate how code obfuscation influences human understanding of programs through an output-prediction task. To study this effect, we construct multiple levels of obfuscation, ranging from unobfuscated code to transformations involving identifier renaming, adversarially misleading identifiers, control-flow modifications, and combinations of these techniques. These transformations are applied to function-level programs written in Python and JavaScript. Participants were asked to predict program outputs while we recorded correctness, response time, and self-reported programming experience. Our results show that obfuscation generally increases the time required to reason about code and tends to reduce prediction accuracy. However, the relationship between obfuscation strength and performance is not strictly monotonic and varies across programming languages. JavaScript exhibits the expected pattern of increasing difficulty with stronger obfuscation, whereas Python displays a more complex trend in which certain renaming transformations can perform comparably to, or occasionally better than, the unobfuscated baseline. Response-time analyses further suggest that obfuscation shifts participants away from rapid, heuristic reasoning toward slower and more deliberate reasoning processes. Performance appears highest within a moderate range of response times, indicating that careful deliberation can improve accuracy, while extremely long response times often correspond to confusion. Finally, programming experience predicts performance primarily within a given language, with limited transfer across languages, suggesting that obfuscation challenges language-specific familiarity more than general programming ability.
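To make the obfuscation levels named in the abstract concrete, the sketch below applies each kind of transformation to one small Python function. It is an illustration only: the function names (`count_even`, `f1`, `sum_odd`, `count_even_flat`) and the sample input are hypothetical and are not taken from the study's stimuli, and the state-machine rewrite is just one possible form of control-flow modification.

```python
def count_even(numbers):
    """Baseline: descriptive names and a direct loop."""
    total = 0
    for value in numbers:
        if value % 2 == 0:
            total += 1
    return total


def f1(a):
    """Identifier renaming: same logic, names carry no meaning."""
    b = 0
    for c in a:
        if c % 2 == 0:
            b += 1
    return b


def sum_odd(strings):
    """Adversarial naming: same logic, names suggest a different computation."""
    longest = 0
    for word in strings:
        if word % 2 == 0:
            longest += 1
    return longest


def count_even_flat(numbers):
    """Control-flow modification: the loop is rewritten as a small state machine."""
    total, index, state = 0, 0, 0
    while state != 3:
        if state == 0:    # loop test: continue or exit
            state = 1 if index < len(numbers) else 3
        elif state == 1:  # loop body
            if numbers[index] % 2 == 0:
                total += 1
            state = 2
        else:             # state == 2: advance to the next element
            index += 1
            state = 0
    return total


# Hypothetical input; every variant must produce the same output.
sample = [3, 4, 8, 11, 12]
variants = [count_even, f1, sum_odd, count_even_flat]
outputs = [fn(sample) for fn in variants]
print(outputs)  # [3, 3, 3, 3]
```

All four variants return the same value, which is what makes an output-prediction task a fair comparison across levels: only the surface form the reader must reason through changes, not the behavior.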

Problem

Research questions and friction points this paper is trying to address.

code obfuscation

program comprehension

human reasoning

output prediction

programming languages

Innovation

Methods, ideas, or system contributions that make the work stand out.

code obfuscation

program comprehension

human reasoning