Step Rejection Fine-Tuning: A Practical Distillation Recipe

📅 2026-05-11

🤖 AI Summary

This work addresses an inefficiency of conventional rejection fine-tuning (RFT) for large language model agents: RFT discards every unsuccessful trajectory, wasting potentially useful information, especially on hard tasks where unresolved trajectories make up a large share of the data. The authors propose Step Rejection Fine-Tuning (SRFT), which uses a critic model to evaluate each step within a trajectory. During training, SRFT masks the loss on steps the critic judges erroneous but keeps those steps in the context, so the model can learn to recover from errors without learning to reproduce them. Valid segments of unresolved trajectories are thus used rather than thrown away. On SWE-bench Verified, SRFT improves the resolution rate by 3.7% (reaching 32.2%), compared with a 2.4% gain from traditional RFT.
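The difference in what each method trains on can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the `Step` and `Trajectory` classes, the `critic_ok` field, and both function names are hypothetical, and the sketch assumes that resolved trajectories are kept whole under SRFT, which the summary does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    text: str
    critic_ok: bool  # hypothetical per-step verdict from a critic model


@dataclass
class Trajectory:
    steps: List[Step]
    resolved: bool  # True if the submitted patch passed the tests


def rft_trainable_steps(trajectories: List[Trajectory]) -> List[str]:
    """RFT: only steps from resolved trajectories contribute to training."""
    return [s.text for t in trajectories if t.resolved for s in t.steps]


def srft_trainable_steps(trajectories: List[Trajectory]) -> List[str]:
    """SRFT-style selection (assumed): resolved trajectories are kept whole;
    unresolved ones contribute only the steps the critic accepted."""
    return [
        s.text
        for t in trajectories
        for s in t.steps
        if t.resolved or s.critic_ok
    ]


trajectories = [
    Trajectory([Step("open file", True), Step("apply fix", True)], resolved=True),
    Trajectory([Step("run tests", True), Step("edit wrong file", False)], resolved=False),
]
rft_steps = rft_trainable_steps(trajectories)
srft_steps = srft_trainable_steps(trajectories)
```

Here `rft_steps` contains only the two steps of the resolved run, while `srft_steps` additionally contains the critic-approved step from the unresolved run. The rejected step is excluded from the loss but, per the description above, would still remain in the model's context.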

📝 Abstract

Rejection Fine-Tuning (RFT) is a standard method for training LLM agents in which unsuccessful trajectories are discarded from the training set. For SWE-bench tasks, this means filtering out runs whose submitted patch does not pass the tests. However, unresolved trajectories make up a large portion of all trajectories for hard tasks, and even they may be partially correct. In this work, we propose Step Rejection Fine-Tuning (SRFT), a practical way to leverage these unresolved trajectories. We employ a critic LLM to assess the correctness of each step in a trajectory. During training, we then mask the loss for erroneous steps while retaining them in the context window. This way, the model learns to recover from errors without reproducing them. Evaluation on SWE-bench Verified shows that while RFT improves the resolution rate by 2.4% by excluding unresolved trajectories, SRFT improves it by 3.7% by filtering them at the step level instead of discarding them completely, reaching a total resolution rate of 32.2%.
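The step-level loss masking described in the abstract might look roughly like the following. This is a minimal sketch under stated assumptions rather than the paper's implementation: the function names, the `IGNORE_INDEX` sentinel, and the toy token IDs are all hypothetical, and the sketch skips details such as the next-token shift and the handling of prompt or environment-observation tokens.

```python
from typing import List, Sequence, Tuple

# Sentinel meaning "compute no loss at this position" (an assumed convention
# for this sketch; the abstract does not specify one).
IGNORE_INDEX = -100


def build_labels(
    step_token_ids: Sequence[Sequence[int]],
    step_is_correct: Sequence[bool],
) -> Tuple[List[int], List[int]]:
    """Flatten a trajectory into one token sequence.

    Every step stays in the input (the context window), but the labels of
    steps flagged as erroneous are replaced by IGNORE_INDEX.
    """
    input_ids: List[int] = []
    labels: List[int] = []
    for tokens, ok in zip(step_token_ids, step_is_correct):
        input_ids.extend(tokens)
        labels.extend(tokens if ok else [IGNORE_INDEX] * len(tokens))
    return input_ids, labels


def masked_nll(token_log_probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean negative log-likelihood over positions whose label is not masked."""
    kept = [-lp for lp, y in zip(token_log_probs, labels) if y != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0


# Three steps; the critic (hypothetically) flags the middle one as erroneous.
input_ids, labels = build_labels([[1, 2], [3], [4, 5]], [True, False, True])
# Toy per-token log-probabilities the model assigned to each target token.
loss = masked_nll([-1.0, -1.0, -5.0, -2.0, -2.0], labels)
```

The erroneous step's token (ID 3) still appears in `input_ids`, so later steps are conditioned on it, but its large log-probability penalty is excluded from `loss`. This matches the stated goal of learning recovery from errors without training the model to reproduce them.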

Problem

Research questions and friction points this paper is trying to address.

Rejection Fine-Tuning

LLM agents

trajectory filtering

SWE-bench

unresolved trajectories

Innovation

Methods, ideas, or system contributions that make the work stand out.

Step Rejection Fine-Tuning

LLM agent training

trajectory filtering

critic-guided masking

SWE-bench
