🤖 AI Summary
This study addresses the lack of large-scale empirical analysis of AI-generated code in real-world software repositories. By constructing a large dataset of authentic code repositories and combining heuristic filtering with large language model-based classification, it systematically identifies and measures, for the first time, multidimensional characteristics of AI-generated code in realistic development contexts. The code is evaluated comprehensively from both code-level and commit-level perspectives, covering complexity, structural properties, defect rates, and developer behavior. The findings reveal that AI-generated code differs significantly from human-written code in structural simplicity, complexity distribution, and commit stability, offering critical empirical evidence for understanding the practical impact of AI-assisted programming.
📄 Abstract
Large language models (LLMs) are rapidly transforming software engineering by enabling developers to generate code ranging from small snippets to entire projects. As AI-generated code becomes increasingly integrated into real-world systems, understanding its characteristics and impact is critical. However, prior work primarily focuses on small-scale, controlled evaluations and lacks comprehensive analysis in real-world settings.
In this paper, we present a large-scale empirical study of AI-generated code in real-world repositories. We analyze both code-level metrics (e.g., complexity, structure, and defect-related indicators) and commit-level characteristics (e.g., commit size, frequency, and post-commit stability). To enable this study, we develop a heuristic filter combined with LLM-based classification to identify AI-generated code, and use it to construct a large dataset.
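The two-stage identification approach described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual implementation: the keyword patterns, function names, and the `classify_with_llm` stub are all assumptions, and a real pipeline would call a model API in the second stage.

```python
import re

# Hypothetical two-stage detection pipeline: a cheap heuristic filter runs
# first, and only the candidates it flags reach the expensive LLM classifier.
# The patterns below are illustrative examples, not the study's actual rules.
AI_HINT_PATTERNS = [
    re.compile(r"co-authored-by:.*copilot", re.IGNORECASE),
    re.compile(r"generated (?:by|with) (?:ai|chatgpt|gpt|copilot)", re.IGNORECASE),
]

def heuristic_filter(commit_message: str) -> bool:
    """Stage 1: flag commits whose message hints at AI assistance."""
    return any(p.search(commit_message) for p in AI_HINT_PATTERNS)

def classify_with_llm(diff: str) -> bool:
    """Stage 2 (stub): ask an LLM whether the diff looks AI-generated.
    A real pipeline would send the diff to a model API here."""
    raise NotImplementedError

def is_ai_generated(commit_message: str, diff: str) -> bool:
    # Short-circuit: the LLM is only consulted for heuristic candidates.
    return heuristic_filter(commit_message) and classify_with_llm(diff)
```

The design choice here is cost-driven: the heuristic pass prunes the vast majority of commits cheaply, so the LLM classifier only has to judge a small candidate set.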
Our results provide new insights into how AI-generated code differs from human-written code and how AI assistance influences development practices. These findings contribute to a deeper understanding of the practical implications of AI-assisted programming.