Efficient Personalization of Generative Models via Optimal Experimental Design

📅 2025-12-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high cost and low data efficiency of acquiring human feedback for generative model personalization, this paper proposes a preference query selection method based on optimal experimental design (OED) for efficiently inferring users' implicit reward functions. We are the first to integrate OED theory into the Bayesian preference learning framework, formulating an information-maximization objective that admits a convex optimization formulation. Building on this, we develop ED-PBRL, a scalable algorithm that supports structured query generation (e.g., text and image prompts). In text-to-image style personalization, ED-PBRL substantially reduces the number of required preference queries and achieves better data efficiency than random sampling, with both theoretical guarantees and empirical validation. Our core contribution is the principled unification of OED and preference learning, establishing a novel paradigm for implicit reward modeling under minimal feedback constraints.

📝 Abstract
Preference learning from human feedback can align generative models with the needs of end users. Human feedback is costly and time-consuming to obtain, which creates demand for data-efficient query selection methods. This work presents a novel approach that leverages optimal experimental design to ask humans the most informative preference queries, from which the latent reward function modeling user preferences can be elicited efficiently. We formulate preference query selection as the problem of maximizing the information gained about the underlying latent preference model. We show that this problem admits a convex optimization formulation, and introduce ED-PBRL, a statistically and computationally efficient algorithm that is supported by theoretical guarantees and can efficiently construct structured queries such as images or text. We empirically validate the proposed framework by personalizing a text-to-image generative model to user-specific styles, showing that it requires fewer preference queries than random query selection.
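The page does not include ED-PBRL's details, but the convex query-selection idea the abstract describes can be illustrated with a classical D-optimal experimental design: represent each candidate preference query by a feature vector (for pairwise queries, often the difference of the two candidates' features) and optimize a weight distribution over queries to maximize the log-determinant of the resulting information matrix. The sketch below is a generic, hypothetical illustration using the standard multiplicative (Titterington-style) update, not the paper's actual algorithm; all names and the feature construction are assumptions.

```python
import numpy as np

def d_optimal_weights(features, n_iter=500):
    """Approximately solve the D-optimal design problem
    maximize log det(sum_i w_i x_i x_i^T) over the probability simplex,
    via the classical multiplicative update w_i <- w_i * (x_i^T M^{-1} x_i) / d.
    """
    n, d = features.shape
    w = np.full(n, 1.0 / n)  # start from the uniform design
    for _ in range(n_iter):
        M = features.T @ (w[:, None] * features)  # information matrix M(w)
        M_inv = np.linalg.inv(M)
        # leverage scores x_i^T M^{-1} x_i for every candidate query
        lev = np.einsum("ij,jk,ik->i", features, M_inv, features)
        w = w * lev / d  # update preserves sum(w) == 1
    return w

# Hypothetical setup: 20 candidate generations with 5-dim features;
# a pairwise preference query (i, j) is encoded by the feature difference.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(20, 5))
pairs = [(i, j) for i in range(20) for j in range(i + 1, 20)]
diffs = np.array([candidates[i] - candidates[j] for i, j in pairs])

w = d_optimal_weights(diffs)
most_informative = np.argsort(w)[::-1][:5]  # queries to show the user first
```

The resulting weights concentrate on the query pairs whose feature differences jointly span the reward-parameter space, which is the sense in which such designs need fewer human labels than uniformly random queries.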
Problem

Research questions and friction points this paper is trying to address.

Optimizes query selection for human feedback
Efficiently personalizes generative models via preference learning
Reduces required queries for aligning models with users
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal experimental design for informative preference queries
Convex optimization formulation for query selection
ED-PBRL algorithm efficiently constructs structured queries
Guy Schacht
ETH Zurich, Switzerland
Ziyad Sheebaelhamd
University of Tübingen, Germany
Riccardo De Santi
ETH AI Center
Generative Optimization · Scientific Discovery · Reinforcement Learning · Machine Learning
Mojmír Mutný
ETH
Optimization · Machine Learning · Bandits · Active Learning · Experiment Design
Andreas Krause
ETH Zurich, Switzerland ETH AI Center, Switzerland