🤖 AI Summary
This work addresses the high cost of acquiring human preference data for aligning large language models by introducing active learning methods tailored to the structure of preference learning. Moving beyond conventional G- or D-optimality criteria, the paper proposes two novel algorithms: the first comes with an instance-dependent theoretical guarantee on label complexity, the first such guarantee for preference learning, and the second offers an efficient greedy query strategy suitable for practical deployment. Experiments on real-world preference datasets demonstrate that the proposed approaches substantially improve sample efficiency over existing methods while retaining rigorous theoretical foundations.
📝 Abstract
Aligning large language models (LLMs) depends on high-quality datasets of human preference labels, which are costly to collect. Although active learning has been studied as a way to improve sample efficiency relative to passive collection, many existing approaches adopt classical experimental design criteria such as G- or D-optimality. These objectives are not tailored to the structure of preference learning, leaving open the design of problem-specific algorithms. In this work, we identify a simple intuition specific to preference learning that calls into question the suitability of these existing design objectives. Motivated by this insight, we propose two active learning algorithms. The first admits an instance-dependent label complexity guarantee, the first such guarantee for this setting, and the second is a simple, practical greedy method. We evaluate our algorithms on real-world preference datasets and observe improved sample efficiency compared to existing methods.
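The abstract does not spell out the greedy method, but the general shape of a greedy active loop for preference learning can be illustrated. The sketch below is a generic uncertainty-based baseline, not the paper's algorithm: it assumes a standard Bradley-Terry preference model with a hidden parameter `theta_star`, and at each round queries the candidate pair whose predicted outcome is most uncertain, then refits the model by maximum likelihood. All names (`bt_prob`, `greedy_query`, `fit`) and the simulated annotator are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def bt_prob(theta, diff):
    """Bradley-Terry probability that the first item of a pair is preferred,
    given the feature difference of the pair."""
    return 1.0 / (1.0 + np.exp(-theta @ diff))

# Hypothetical instance: n items with d-dimensional features; simulated
# preferences follow a Bradley-Terry model with hidden theta_star.
d, n = 4, 20
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)
candidates = [(i, j) for i in range(n) for j in range(i + 1, n)]

def greedy_query(theta_hat):
    """Greedily pick (and remove) the unlabeled pair whose predicted
    outcome is most uncertain, i.e. win probability closest to 1/2."""
    scores = [abs(bt_prob(theta_hat, X[i] - X[j]) - 0.5) for i, j in candidates]
    return candidates.pop(int(np.argmin(scores)))

def fit(labeled, steps=200, lr=0.5):
    """Refit theta by gradient ascent on the Bradley-Terry
    log-likelihood of the labeled pairs."""
    theta = np.zeros(d)
    for _ in range(steps):
        grad = np.zeros(d)
        for (i, j), y in labeled:
            diff = X[i] - X[j]
            grad += (y - bt_prob(theta, diff)) * diff
        theta += lr * grad / max(len(labeled), 1)
    return theta

# Active loop: query one preference label per round.
labeled, theta_hat = [], np.zeros(d)
for _ in range(15):
    i, j = greedy_query(theta_hat)
    y = float(rng.random() < bt_prob(theta_star, X[i] - X[j]))  # simulated annotator
    labeled.append(((i, j), y))
    theta_hat = fit(labeled)
```

The selection rule here is plain predictive uncertainty; the paper's contribution is a criterion specifically motivated by the structure of preference learning, which this generic baseline does not capture.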