Discriminative Finetuning of Generative Large Language Models without Reward Models and Preference Data

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
How can large language models (LLMs) be better aligned without relying on preference data or reward models? This paper introduces Discriminative Fine-Tuning (DFT), which shifts from the conventional generative token-prediction paradigm to discriminative answer selection. DFT establishes the first probabilistic framework that explicitly models the discriminative likelihood of positive answers relative to the entire output space. Training is performed end-to-end via logits-level suppression of negative answers, obviating reinforcement learning and preference annotations entirely. Experiments demonstrate that DFT significantly outperforms supervised fine-tuning (SFT) across multiple benchmarks, matching or exceeding the performance of SFT→PO (preference-optimized) methods. Crucially, this work provides the first empirical validation of discriminative fine-tuning—without any preference data—as an effective, generalizable, and practical alignment strategy for LLMs.

📝 Abstract
Supervised fine-tuning (SFT) followed by preference optimization (PO), denoted by SFT$\rightarrow$PO, has become the standard for improving pretrained large language models (LLMs), with PO demonstrating significant performance gains. However, PO methods rely on either human-labeled preference data or a strong reward model to generate preference data. Can we fine-tune LLMs without preference data or reward models while achieving competitive performance to SFT$\rightarrow$PO? We address this question by introducing Discriminative Fine-Tuning (DFT), a novel approach that eliminates the need for preference data. Unlike SFT, which employs a generative approach and overlooks negative data, DFT adopts a discriminative paradigm that increases the probability of positive answers while suppressing potentially negative ones, shifting from token prediction to data prediction. Our contributions include: (i) a discriminative probabilistic framework for fine-tuning LLMs by explicitly modeling the discriminative likelihood of an answer among all possible outputs given an input; (ii) efficient algorithms to optimize this discriminative likelihood; and (iii) extensive experiments demonstrating DFT's effectiveness, achieving performance better than SFT and comparable to if not better than SFT$\rightarrow$PO. The code can be found at https://github.com/PenGuln/DFT.
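The abstract's core idea, modeling the discriminative likelihood of the positive answer among candidate outputs, can be illustrated with a toy softmax-style loss. The sketch below is an illustrative interpretation, not the paper's exact formulation: `dft_discriminative_loss` is a hypothetical function name, the inputs are assumed to be sequence-level log-likelihoods under the model, and the negatives are assumed to be sampled answers standing in for "all possible outputs".

```python
import math

def dft_discriminative_loss(pos_logp, neg_logps):
    """Toy discriminative objective in the spirit of DFT (illustrative only).

    pos_logp:  model log-likelihood of the positive (good) answer.
    neg_logps: model log-likelihoods of sampled negative answers,
               approximating the rest of the output space.

    Returns the negative log of the softmax probability assigned to the
    positive answer among {positive} ∪ {negatives}. Minimizing it raises
    the positive's likelihood while suppressing the negatives'.
    """
    logps = [pos_logp] + list(neg_logps)
    log_norm = math.log(sum(math.exp(lp) for lp in logps))
    return -(pos_logp - log_norm)

# A positive answer that already dominates the negatives yields a small loss;
# weakening the positive's log-likelihood increases it.
easy = dft_discriminative_loss(-2.0, [-5.0, -6.0])
hard = dft_discriminative_loss(-4.0, [-5.0, -6.0])
```

Minimizing such a loss end-to-end over the model's logits matches the abstract's framing of shifting from generative token prediction to discriminative data prediction, without any preference pairs or reward model.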
Problem

Research questions and friction points this paper is trying to address.

Fine-tune LLMs without preference data
Eliminate need for reward models
Achieve competitive performance to SFT→PO
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discriminative Fine-Tuning eliminates preference data
DFT shifts from token to data prediction
DFT outperforms SFT, matches SFT→PO performance
Siqi Guo
PhD Student, Purdue University
HAI, HCI, VR, intelligent virtual agents, embodied conversational agents
Ilgee Hong
Georgia Institute of Technology
Machine Learning, Large Language Models
Vicente Balmaseda
Department of Computer Science and Engineering, Texas A&M University, College Station, USA
Tuo Zhao
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, USA
Tianbao Yang
Texas A&M University
machine learning, stochastic optimization