An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Preference tuning often suffers significant performance and utility degradation in cross-domain scenarios due to distributional shift. This work systematically investigates the generalization capabilities of various preference alignment objectives under domain transfer and, for the first time, compares multiple adaptation strategies on summarization and question-answering tasks. Through the incorporation of pseudo-labeling, supervised fine-tuning, and multi-objective alignment, our experiments demonstrate that pseudo-labeling effectively mitigates performance deterioration caused by domain shift and substantially enhances cross-domain generalization. This study provides both empirical foundations and practical solutions for robust domain transfer in preference-based fine-tuning.

📝 Abstract
Preference tuning aligns pretrained language models to human judgments of quality, helpfulness, or safety by optimizing over explicit preference signals rather than likelihood alone. Prior work has shown that preference tuning degrades performance and reduces helpfulness when evaluated outside the training domain. However, the extent to which adaptation strategies mitigate this domain shift remains unexplored. We address this challenge by conducting a comprehensive and systematic study of alignment generalization under domain shift. We compare five popular alignment objectives and various source-to-target adaptation strategies, including target-domain supervised fine-tuning and pseudo-labeling, across summarization and question-answering helpfulness tasks. Our findings reveal systematic differences in generalization across alignment objectives under domain shift. We show that adaptation strategies based on pseudo-labeling can substantially reduce domain-shift degradation.
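The pseudo-labeling adaptation described in the abstract can be illustrated with a minimal sketch: unlabeled target-domain prompts are paired with sampled candidate responses, a source-trained scorer ranks them, and the best and worst candidates become pseudo preference pairs for further alignment training. The `score` function below is a hypothetical stand-in (a toy length heuristic), not the paper's actual reward model.

```python
# Sketch of pseudo-labeling for preference tuning under domain shift.
# Assumption: `score` stands in for a reward model trained on the source
# domain; here it is a toy length-based heuristic for illustration only.

def score(prompt: str, response: str) -> float:
    """Toy scorer: prefers longer responses (placeholder for a real reward model)."""
    return float(len(response))

def build_pseudo_preference_pairs(prompts, candidates_per_prompt):
    """For each target-domain prompt, rank its sampled candidate responses
    with the source-domain scorer and emit a (chosen, rejected) pair."""
    pairs = []
    for prompt, candidates in zip(prompts, candidates_per_prompt):
        ranked = sorted(candidates, key=lambda r: score(prompt, r), reverse=True)
        pairs.append({"prompt": prompt,
                      "chosen": ranked[0],      # highest-scoring candidate
                      "rejected": ranked[-1]})  # lowest-scoring candidate
    return pairs

pairs = build_pseudo_preference_pairs(
    ["Summarize the article."],
    [["Short.", "A longer, more detailed summary of the article."]],
)
print(pairs[0]["chosen"])
```

The resulting pairs could then feed a preference objective (e.g. DPO-style training) on the target domain; the paper compares several such objectives, which this sketch does not implement.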
Problem

Research questions and friction points this paper is trying to address.

preference tuning
domain shift
generalization
alignment
diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

preference tuning
domain shift
alignment generalization
pseudo-labeling
empirical study
Constantinos F. Karouzos
School of Computer Science, University of Sheffield, UK
Xingwei Tan
Research Associate, Natural Language Processing
Nikolaos Aletras
School of Computer Science, University of Sheffield, UK