Edit-level Majority Voting Mitigates Over-Correction in LLM-based Grammatical Error Correction

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses over-correction in grammatical error correction (GEC) with large language models, where unnecessary edits undermine correction accuracy. The authors propose a training-free inference method that applies majority voting at the edit level across multiple candidates generated by a single model, so no model modification or additional training is required. The method is compared against greedy decoding and Minimum Bayes Risk (MBR) decoding. On nine benchmarks spanning seven languages, it outperforms both baselines in most cases and yields stable correction quality across different instruction prompts.

📝 Abstract

Grammatical error correction (GEC) using large language models often suffers from over-correction. To mitigate this, we propose a training-free inference method that performs edit-level majority voting over multiple candidates generated by a single model, without requiring model modifications or additional training. Across nine benchmarks covering English, Czech, German, Ukrainian, Korean, Hindi, and Romanian, the proposed method outperforms both greedy and MBR decoding in most cases. Moreover, it yields stable correction quality regardless of the instruction prompts used. We release two repositories supporting GEC dataset loading and LLM inference.
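
To make the idea of edit-level majority voting concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: edits are extracted with the standard library's `difflib.SequenceMatcher` over whitespace tokens (the paper may use a different edit extractor), an edit is kept only if strictly more than half of the candidates contain it (the paper's actual threshold is not given here), and the function names `extract_edits` and `majority_vote_edits` are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_edits(source, hypothesis):
    """Return token-level edits (start, end, replacement) that turn source into hypothesis.

    Assumption: difflib alignment stands in for a dedicated GEC edit extractor.
    """
    matcher = SequenceMatcher(a=source, b=hypothesis, autojunk=False)
    return {
        (i1, i2, tuple(hypothesis[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    }


def majority_vote_edits(source_text, candidate_texts):
    """Apply to the source only those edits found in a strict majority of candidates."""
    source = source_text.split()
    votes = Counter()
    for text in candidate_texts:
        # Each candidate contributes at most one vote per distinct edit.
        votes.update(extract_edits(source, text.split()))

    # Assumed voting rule: strictly more than half of the candidates.
    kept = [edit for edit, n in votes.items() if 2 * n > len(candidate_texts)]

    # Apply edits from right to left so earlier token indices stay valid.
    kept.sort(key=lambda e: (e[0], e[1]), reverse=True)
    tokens = list(source)
    for start, end, replacement in kept:
        tokens[start:end] = replacement
    return " ".join(tokens)


if __name__ == "__main__":
    src = "She go to school yesterday ."
    candidates = [
        "She went to school yesterday .",
        "She went to school yesterday .",
        "Yesterday , she went to the school .",  # an over-corrected candidate
    ]
    print(majority_vote_edits(src, candidates))
    # prints: She went to school yesterday .
```

In this toy example the third candidate rewrites the sentence more than necessary, but its extra edits each receive only one vote and are discarded, while the `go` to `went` edit is shared by two of the three candidates and is kept. Using a strict majority also means that any two kept edits co-occur in at least one candidate, so they cannot overlap when applied.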

Problem

Research questions and friction points this paper is trying to address.

grammatical error correction

over-correction

large language models

edit-level voting

Innovation

Methods, ideas, or system contributions that make the work stand out.

edit-level majority voting

over-correction mitigation

training-free inference

grammatical error correction