Understanding Teacher Revisions of Large Language Model-Generated Feedback

📅 2026-03-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how teachers revise student feedback generated by large language models and how those revisions shape the instructional content students receive. Drawing on revision data from 117 teachers across 1,349 AI-generated feedback instances, the research combines sentence embeddings with a machine learning classifier (AUC = 0.75), quantitative analysis, and qualitative coding to characterize, for the first time, teachers' actual revision behaviors. Findings reveal that approximately 80% of AI-generated feedback remains unedited, that revised feedback tends to be shortened, that only about 10% of teachers frequently modify the content, and that educators prefer to simplify feedback with high information density. These results uncover key behavioral patterns in human-AI collaborative feedback processes and offer design implications for minimizing unnecessary editing effort.
📝 Abstract
Large language models (LLMs) increasingly generate formative feedback for students, yet little is known about how teachers revise this feedback before it reaches learners. Teachers' revisions shape what students receive, making revision practices central to evaluating AI classroom tools. We analyze a dataset of 1,349 instances of AI-generated feedback and corresponding teacher-edited explanations from 117 teachers. We examine (i) textual characteristics associated with teacher revisions, (ii) whether revision decisions can be predicted from the AI feedback text, and (iii) how revisions change the pedagogical type of feedback delivered. First, we find that teachers accept AI feedback without modification in about 80% of cases, while edited feedback tends to be significantly longer and subsequently shortened by teachers. Editing behavior varies substantially across teachers: about 50% never edit AI feedback, and only about 10% edit more than two-thirds of feedback instances. Second, machine learning models trained only on the AI feedback text as input features, using sentence embeddings, achieve fair performance in identifying which feedback will be revised (AUC=0.75). Third, qualitative coding shows that when revisions occur, teachers often simplify AI-generated feedback, shifting it away from high-information explanations toward more concise, corrective forms. Together, these findings characterize how teachers engage with AI-generated feedback in practice and highlight opportunities to design feedback systems that better align with teacher priorities while reducing unnecessary editing effort.
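The abstract's second research question (predicting which feedback teachers will revise from the feedback text alone) can be sketched as a text-classification pipeline. The paper uses sentence embeddings; the sketch below substitutes a hashed bag-of-words vector as a stand-in embedding so it runs with only the standard library, and the feedback strings, labels, and hyperparameters are all hypothetical illustrations, not the authors' data or model.

```python
import hashlib
import math

def embed(text, dim=64):
    """Hashed bag-of-words vector: a crude stand-in for the sentence
    embeddings used in the paper (illustrative feature choice only)."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def train_logreg(X, y, lr=0.5, epochs=200):
    """Plain logistic regression fit by gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of log-loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical feedback instances: 1 = teacher revised, 0 = kept as-is.
feedback = [
    ("Great job, your answer is correct.", 0),
    ("Correct, well done on this problem.", 0),
    ("Nice work, the solution is right.", 0),
    ("Your answer is incorrect because the derivative of x squared is "
     "two x, and you forgot the chain rule in the second step.", 1),
    ("This is wrong: revisit the distributive property, check each term, "
     "and remember that dividing by a fraction means multiplying by its "
     "reciprocal before simplifying.", 1),
    ("Incorrect; this long explanation covers three separate "
     "misconceptions about exponents and order of operations.", 1),
]
X = [embed(text) for text, _ in feedback]
y = [label for _, label in feedback]
w, b = train_logreg(X, y)
scores = [predict(w, b, xi) for xi in X]
print(f"training AUC = {auc(y, scores):.2f}")
```

This mirrors the study's setup only in shape (feedback text in, revision probability out, ranked by AUC); the reported AUC of 0.75 comes from the paper's embedding-based model on held-out data, not from anything like this toy training set.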
Problem

Research questions and friction points this paper is trying to address.

teacher revisions, large language models, formative feedback, AI-generated feedback, pedagogical feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

teacher revision, large language models, formative feedback, predictive modeling, pedagogical alignment