🤖 AI Summary
This work addresses the challenge of efficiently leveraging fine-grained human feedback for aligning large language models (LLMs). We propose a text-span-level fine-tuning method wherein annotators provide binary “like/dislike” judgments—along with rationales—on local spans of generated text. This drives left-to-right, iterative segment-wise rewriting, yielding a traceable, incremental revision chain. Adjacent revision steps are automatically paired to construct local preference pairs, which are optimized via Direct Preference Optimization (DPO) for structured alignment. Unlike conventional A/B ranking or full-sentence rewrites, our approach decomposes global preference learning into localized, stepwise, and interpretable alignment subtasks. Empirical results demonstrate significant improvements in alignment accuracy and training efficiency across multiple metrics of generation quality and user-preference matching. To our knowledge, this is the first method enabling fine-grained, traceable, feedback-driven generative preference modeling.
📝 Abstract
We present a method and dataset for fine-tuning language models with preference supervision using feedback-driven improvement chains. Given a model response, an annotator provides fine-grained feedback by marking “liked” and “disliked” spans and specifying what they liked or disliked about them. The base model then rewrites the disliked spans accordingly, proceeding from left to right, forming a sequence of incremental improvements. We construct preference pairs for direct alignment from each adjacent step in the chain, enabling the model to learn from localized, targeted edits. We find that our approach outperforms direct alignment methods based on standard A/B preference ranking or full contrastive rewrites, demonstrating that structured, revision-based supervision leads to more efficient and effective preference tuning.
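The pairing step described above can be illustrated with a minimal sketch. The function name, the dictionary schema (`prompt`/`chosen`/`rejected`, the format commonly used by DPO training libraries), and the example revision chain are all assumptions for illustration, not the paper's actual implementation:

```python
from typing import Dict, List

def chain_to_preference_pairs(prompt: str, chain: List[str]) -> List[Dict[str, str]]:
    """Turn an improvement chain into local preference pairs.

    For each adjacent pair of revisions, the later (revised) step is
    treated as the preferred ("chosen") response and its predecessor
    as the dispreferred ("rejected") one.
    """
    return [
        {"prompt": prompt, "chosen": better, "rejected": worse}
        for worse, better in zip(chain, chain[1:])
    ]

# Hypothetical chain: base response, then two targeted span rewrites,
# each fixing one disliked span while leaving the rest intact.
chain = [
    "The mitochondria is the powerhouse of the cell, it makes energy.",
    "The mitochondrion is the powerhouse of the cell, it makes energy.",
    "The mitochondrion is the powerhouse of the cell; it produces ATP.",
]
pairs = chain_to_preference_pairs("Describe the mitochondrion.", chain)
```

A chain of *n* revisions yields *n − 1* preference pairs, each differing only in the edited span, which is what localizes the preference signal compared with a single pair of full contrastive rewrites.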