From Coders to Critics: Empowering Students through Peer Assessment in the Age of AI Copilots

📅 2025-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
The widespread adoption of AI programming assistants exacerbates assessment distortion and academic integrity violations in programming education, and leaves higher-order competencies underdeveloped. This study empirically evaluates a scalable, anonymized, rubric-driven peer assessment mechanism as a robust alternative to traditional instructor grading, which is vulnerable to AI-generated submissions, in a large introductory programming course. Employing a mixed-methods design, we quantify inter-rater reliability via correlation (r ≈ 0.6), mean absolute error (MAE), and root mean square error (RMSE), and analyze educational impact through reflection surveys from 47 student teams. Results demonstrate that the framework maintains baseline reliability while significantly enhancing students' evaluative competence, feedback quality, and course engagement. To our knowledge, this is the first systematic validation in large-scale programming instruction of peer assessment's dual efficacy in upholding academic integrity and fostering higher-order thinking. We propose a novel peer assessment framework balancing reliability, scalability, and pedagogical value.

📝 Abstract
The rapid adoption of AI-powered coding assistants such as ChatGPT and other coding copilots is transforming programming education, raising questions about assessment practices, academic integrity, and skill development. As educators seek alternatives to traditional grading methods susceptible to AI-enabled plagiarism, structured peer assessment could be a promising strategy. This paper presents an empirical study of a rubric-based, anonymized peer review process implemented in a large introductory programming course. Students evaluated each other's final projects (a 2D game), and their assessments were compared to instructor grades using correlation, mean absolute error (MAE), and root mean square error (RMSE). Additionally, reflective surveys from 47 teams captured student perceptions of fairness, grading behavior, and preferences regarding grade aggregation. Results show that peer review can approximate instructor evaluation with moderate accuracy while fostering student engagement, evaluative thinking, and interest in providing constructive feedback to peers. We discuss the implications of these findings for designing scalable, trustworthy peer assessment systems in the age of AI-assisted coding.
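
To make the comparison concrete, here is a minimal sketch (in Python, with made-up per-team scores rather than the study's data) of how peer-assigned and instructor-assigned grades can be compared using the three metrics the paper reports:

```python
import numpy as np

def agreement_metrics(peer, instructor):
    """Quantify agreement between peer and instructor scores."""
    peer = np.asarray(peer, dtype=float)
    instructor = np.asarray(instructor, dtype=float)
    diff = peer - instructor
    # Pearson correlation: do peers rank projects like the instructor?
    r = np.corrcoef(peer, instructor)[0, 1]
    # Mean absolute error: typical magnitude of the grading deviation
    mae = np.mean(np.abs(diff))
    # Root mean square error: like MAE, but penalizes large outliers more
    rmse = np.sqrt(np.mean(diff ** 2))
    return r, mae, rmse

# Hypothetical scores on a 0-100 scale (illustrative, not the paper's data)
peer_scores = [88, 72, 95, 64, 81, 90, 77]
instructor_scores = [85, 70, 92, 70, 78, 93, 75]

r, mae, rmse = agreement_metrics(peer_scores, instructor_scores)
print(f"r = {r:.2f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

Under this setup, a correlation around 0.6 with a modest MAE would match the "moderate accuracy" the paper describes: peer rankings broadly track instructor rankings, while individual scores still deviate by a few points.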
Problem

Research questions and friction points this paper addresses.

Addressing AI's impact on programming education assessment
Exploring peer review as an alternative to traditional grading
Evaluating accuracy and benefits of student peer assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rubric-based anonymized peer review process
Comparison of peer and instructor grades
Reflective surveys on fairness and grading
Santiago Berrezueta-Guzman
Technical University of Munich, Heilbronn, Germany
Stephan Krusche
Professor, Computer Science, Technical University Munich
Education Technologies, Human Computer Interactions, Software Engineering, Machine Learning
Stefan Wagner
Technical University of Munich, Heilbronn, Germany