🤖 AI Summary
Existing evaluation metrics struggle to comprehensively capture the real-world impact of AI programming assistants on developer productivity. This work proposes a six-factor productivity framework that integrates both short- and long-term dimensions, incorporating human-centric measures—such as technical expertise and sense of work ownership—that have been largely overlooked in prior research. Through a mixed-methods approach involving a survey of 2,989 developers and 11 in-depth interviews, the study systematically examines the multifaceted effects of AI coding assistants. Findings reveal significant divergence among developers regarding the perceived utility of these tools, underscoring the necessity of a multidimensional, human-centered evaluation paradigm. The proposed framework offers a novel foundation for assessing the efficacy of AI-assisted programming in future research and practice.
📝 Abstract
Measuring developer productivity is a topic that has attracted attention from both academic research and industrial practice. In the age of AI coding assistants, it has become even more important for both academia and industry to understand how to measure their impact on developer productivity, and to reconsider whether earlier measures and frameworks still apply. This study analyzes the validity of different approaches to evaluating the productivity impacts of AI coding assistants through mixed-methods research. At BNY Mellon, we conducted a survey that received 2,989 developer responses and carried out 11 in-depth interviews. Our findings demonstrate that a multifaceted approach is needed to measure AI productivity impacts: survey results expose conflicting perspectives on AI tool usefulness, while interviews elicit six distinct factors that capture both short-term and long-term dimensions of productivity. In contrast to prior work, our factors highlight the importance of long-term measures such as technical expertise and ownership of work. We hope this work encourages future research to incorporate a broader range of human-centered factors, and supports industry in adopting more holistic approaches to evaluating developer productivity.