📝 Abstract
Context. AI-based development tools, such as GitHub Copilot, are transforming the software development process by offering real-time code suggestions. These tools promise to improve productivity by reducing cognitive load and speeding up task completion. Previous exploratory studies, however, show that developers sometimes perceive the automatic suggestions as intrusive and, as a result, feel that their productivity decreases.

Theory. We propose two theories on the impact of automatic suggestions on frustration and productivity. First, we hypothesize that experienced developers are frustrated by automatic suggestions (especially irrelevant ones), which also negatively impacts their productivity. Second, we conjecture that novice developers benefit from automatic suggestions, which reduce the frustration caused by being stuck on a technical problem and thus increase their productivity.

Objective. We plan to conduct a quasi-experimental study to test our theories. The empirical evidence we will collect will allow us to either corroborate or reject them.

Method. We will involve at least 32 developers, both experts and novices. We will ask each of them to complete two software development tasks, one with automatic suggestions enabled and one with them disabled, allowing for within-subject comparisons. We will measure the independent and dependent variables by monitoring developers' actions through an IDE plugin and screen recordings. In addition, we will collect physiological data through a wearable device. We will use statistical hypothesis tests to study the effects of the treatments (i.e., automatic suggestions enabled/disabled) on the outcomes (frustration and productivity).
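The within-subject design described above pairs each participant's two measurements (suggestions enabled vs. disabled), which makes a paired test a natural choice for the planned hypothesis tests. The sketch below is only an illustration of that analysis, not the study's actual pipeline: the frustration scores are invented placeholder values, and the paired t-test is implemented by hand so the example needs nothing beyond the Python standard library.

```python
import math
import statistics

def paired_t(a, b):
    """Paired t statistic for two matched samples.

    Returns (t, df). The p-value lookup against the t distribution is
    omitted to keep this dependency-free sketch short; in practice one
    would use a statistics library for the full test.
    """
    diffs = [x - y for x, y in zip(a, b)]          # per-participant differences
    n = len(diffs)
    mean_d = statistics.mean(diffs)                # mean difference
    sd_d = statistics.stdev(diffs)                 # sample std. dev. of differences
    t = mean_d / (sd_d / math.sqrt(n))             # t = mean / standard error
    return t, n - 1

# Hypothetical frustration scores for the same eight expert developers
# under both treatments (one pair per participant; values are invented).
experts_suggestions_on  = [6.1, 5.8, 7.0, 6.4, 5.9, 6.7, 6.2, 6.8]
experts_suggestions_off = [4.9, 5.1, 5.6, 5.0, 4.8, 5.4, 5.2, 5.5]

t_stat, df = paired_t(experts_suggestions_on, experts_suggestions_off)
print(f"paired t = {t_stat:.2f}, df = {df}")
```

A positive t statistic on this placeholder data would point in the direction of the first theory (higher frustration with suggestions enabled for experts); the study would run the analogous comparison for novices and examine the interaction with experience level.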