Teaching LLMs Human-Like Editing of Inappropriate Argumentation via Reinforcement Learning

📅 2026-04-14

🤖 AI Summary

This work addresses the tendency of large language models (LLMs) to produce scattered, meaning-altering edits when revising human-written arguments, whereas human editors group dependent changes into self-contained, meaning-preserving edits. The authors propose a reinforcement learning approach that trains a model with group relative policy optimization to generate human-like, sentence-level edit suggestions, each of which can be accepted or rejected independently. The multi-component reward function combines edit-level semantic similarity, fluency, and pattern conformity with argument-level appropriateness. In automatic and human evaluation, the approach outperforms competitive baselines and the state of the art in human-like editing, and multi-round editing reaches appropriateness close to that of full rewriting.
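
To illustrate what independently adoptable sentence-level edits could look like in practice, here is a minimal Python sketch. The `SentenceEdit` class, the `apply_edits` function, and the example sentences are all hypothetical and are not taken from the paper; the sketch only assumes that each suggestion replaces exactly one whole sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentenceEdit:
    """Hypothetical container for one self-contained edit suggestion."""
    index: int        # position of the sentence the edit replaces
    replacement: str  # the full rewritten sentence


def apply_edits(sentences, edits, accepted):
    """Return a new sentence list with only the accepted edits applied.

    `accepted` is a set of sentence indices whose suggestions the user keeps.
    """
    result = list(sentences)  # copy so the original argument is untouched
    for edit in edits:
        if edit.index in accepted:
            result[edit.index] = edit.replacement
    return result


# Illustrative data only (assumption, not from the paper).
argument = [
    "Anyone who disagrees is clueless.",
    "Public transport cuts emissions.",
]
suggestions = [
    SentenceEdit(0, "I think people who disagree may be missing some evidence."),
    SentenceEdit(1, "Public transport can cut emissions."),
]

# Accept the first suggestion, reject the second.
revised = apply_edits(argument, suggestions, accepted={0})
```

Because each suggestion touches exactly one sentence, rejecting one leaves the others valid, which is the property the summary describes as independently adoptable.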

📝 Abstract

Editing human-written text has become a standard use case of large language models (LLMs), for example, to make one's arguments more appropriate for a discussion. Comparing human to LLM-generated edits, however, we observe a mismatch in editing strategies: While LLMs often perform multiple scattered edits and tend to change meaning notably, humans rather encapsulate dependent changes in self-contained, meaning-preserving edits. In this paper, we present a reinforcement learning approach that teaches LLMs human-like editing to improve the appropriateness of arguments. Our approach produces self-contained sentence-level edit suggestions that can be accepted or rejected independently. We train the approach using group relative policy optimization with a multi-component reward function that jointly optimizes edit-level semantic similarity, fluency, and pattern conformity as well as argument-level appropriateness. In automatic and human evaluation, it outperforms competitive baselines and the state of the art in human-like editing, with multi-round editing achieving appropriateness close to full rewriting.
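
The training signal described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the equal weights, the component scores, and the function names `combined_reward` and `group_relative_advantages` are hypothetical, and the advantage computation follows the commonly described form of group relative policy optimization, in which each sampled output's reward is standardized against the other outputs in its group.

```python
from statistics import mean, pstdev

# Hypothetical equal weights; the abstract does not state how components are weighted.
WEIGHTS = {
    "similarity": 0.25,       # edit-level semantic similarity
    "fluency": 0.25,          # edit-level fluency
    "pattern": 0.25,          # edit-level pattern conformity
    "appropriateness": 0.25,  # argument-level appropriateness
}


def combined_reward(scores, weights=WEIGHTS):
    """Weighted sum of the component scores for one sampled edit."""
    return sum(weights[name] * scores[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the mean and std of its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative scores for three edits sampled for the same argument (made-up numbers).
group = [
    {"similarity": 0.9, "fluency": 0.8, "pattern": 1.0, "appropriateness": 0.7},
    {"similarity": 0.4, "fluency": 0.9, "pattern": 0.0, "appropriateness": 0.9},
    {"similarity": 0.8, "fluency": 0.7, "pattern": 1.0, "appropriateness": 0.5},
]

rewards = [combined_reward(s) for s in group]
advantages = group_relative_advantages(rewards)
```

In this sketch, an edit that preserves meaning and conforms to the expected edit pattern receives a positive advantage relative to its group, even if another sample scores higher on appropriateness alone.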

Problem

Research questions and friction points this paper is trying to address.

human-like editing

argumentation appropriateness

large language models

text editing

semantic preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning

human-like editing

self-contained edits