Corrector Sampling in Language Models

📅 2025-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Autoregressive language models accumulate errors because of their unidirectional, left-to-right generation. To address this, the paper proposes Resample-Previous-Tokens (RPT), a plug-and-play local resampling method that integrates into standard autoregressive decoding without modifying the model architecture. RPT iteratively backtracks and resamples previously generated tokens within a sliding window, enabling inference-time correction of earlier mistakes; lightweight fine-tuning (on ~100B tokens) yields further gains. Applied to an 8B-parameter model, RPT achieves roughly 10% relative improvement on coding and general reasoning benchmarks, mitigating error propagation while largely preserving decoding speed, and striking a favorable balance between correction capability and computational overhead.

📝 Abstract
Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. This method can be integrated into existing autoregressive models, preserving their next-token-prediction quality and speed. Fine-tuning a pretrained 8B-parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
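The window-based resampling described in the abstract can be sketched as follows. This is a simplified illustration, not the paper's exact algorithm: the `model` callable (returning a next-token distribution as a `dict`) is a hypothetical interface, and resampling each window token from its prefix-conditional distribution is an assumption for illustration; the paper's actual proposal distribution may differ.

```python
import random


def sample(probs):
    """Draw one token from a dict of token -> probability."""
    r = random.random()
    acc = 0.0
    for tok, p in probs.items():
        acc += p
        if acc >= r:
            return tok
    return tok  # numerical-slack fallback


def rpt_decode(model, prompt, max_new_tokens=64, window=4):
    """Simplified sketch of Resample-Previous-Tokens (RPT) decoding.

    `model(tokens)` is assumed to return a next-token probability
    distribution given the token sequence (hypothetical API). After
    each new token is appended, the last `window` generated tokens
    are revisited and resampled from the model's conditional at
    their positions, giving the decoder a chance to revise errors.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # Standard autoregressive step.
        tokens.append(sample(model(tokens)))
        # Backtrack: resample a sliding window of previous tokens
        # (never touching the prompt itself).
        start = max(len(prompt), len(tokens) - window)
        for i in range(start, len(tokens)):
            tokens[i] = sample(model(tokens[:i]))
    return tokens
```

With a deterministic toy model the loop reduces to ordinary decoding, which makes the control flow easy to check; the value of resampling only appears with a stochastic model, where a low-probability early token can be replaced on a later pass.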
Problem

Research questions and friction points this paper is trying to address.

Autoregressive models accumulate errors during left-to-right generation
Proposes RPT to iteratively resample and correct previously generated tokens
Yields ~10% relative improvement on reasoning and coding benchmarks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative token resampling for error correction
Window-based previous token replacement
Minimal fine-tuning for performance boost