🤖 AI Summary
This work addresses the ethical challenge of detecting AI-generated code. We introduce AIGCodeSet, a Python code detection dataset comprising 2,828 AI-generated and 4,755 human-written samples. To support evaluation across generative models, the dataset integrates code produced by CodeLlama 34B, Codestral 22B, and Gemini 1.5 Flash. Methodologically, we propose a Bayesian classifier leveraging statistical token-level features of source code, which improves on conventional machine learning and LLM-based baselines in both accuracy and cross-model robustness. The approach is computationally efficient, interpretable, and generalizes across diverse generative models. Collectively, this work provides a reliable, scalable technical foundation for academic integrity assessment and ethical governance in software engineering.
📄 Abstract
While large language models provide significant convenience for software development, they can also lead to ethical issues in job interviews and student assignments. Determining whether a piece of code was written by a human or generated by an artificial intelligence (AI) model is therefore a critical problem. In this study, we present AIGCodeSet, which consists of 2,828 AI-generated and 4,755 human-written Python code samples, with the AI-generated code created using CodeLlama 34B, Codestral 22B, and Gemini 1.5 Flash. In addition, we share the results of our experiments with baseline detection methods. Our experiments show that a Bayesian classifier outperforms the other models.
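The abstract does not spell out the classifier's implementation, but the idea of a Bayesian detector over statistical token-level features can be sketched with a multinomial naive Bayes model on token counts. This is a minimal illustration, not the paper's exact method: the toy snippets, the `CountVectorizer` tokenization, and the scikit-learn pipeline are assumptions for demonstration only.

```python
# Hedged sketch: naive Bayes over token counts to separate
# human-written from AI-generated code (toy data, illustrative only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set; a real setup would use AIGCodeSet.
human_snippets = [
    "def add(a, b):\n    return a + b",
    "for i in range(10): print(i)",
]
ai_snippets = [
    'def add_numbers(first_number, second_number):\n'
    '    """Return the sum of two numbers."""\n'
    '    return first_number + second_number',
    "result = [index for index in range(10)]\nprint(result)",
]

X = human_snippets + ai_snippets
y = [0, 0, 1, 1]  # 0 = human-written, 1 = AI-generated

# Tokenize on word characters; richer token-level features (keywords,
# identifiers, character n-grams) could be substituted here.
model = make_pipeline(CountVectorizer(token_pattern=r"\w+"), MultinomialNB())
model.fit(X, y)

# Predict a label for an unseen snippet.
print(model.predict(["def f(x): return x * 2"]))
```

A naive Bayes model like this is cheap to train and its per-token log-probabilities are directly inspectable, which matches the efficiency and interpretability claims made for Bayesian classifiers in this setting.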