Aligning LLMs by Predicting Preferences from User Writing Samples

📅 2025-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods for inferring user preferences tend to produce generic descriptions that fail to capture what is specific to an individual user. PROSE addresses this by iteratively refining the inferred preference description and verifying it across multiple user writing samples. Evaluated with Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o on a summarization task and an email writing task, PROSE improves the quality of the writing agent's generations by 33% over CIPHER, a state-of-the-art preference inference method. PROSE and in-context learning (ICL) are complementary: combining them yields up to a 9% improvement over ICL alone.

📝 Abstract

Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization and an email writing task. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER (a state-of-the-art method for inferring preferences) by 33%. Lastly, we demonstrate that ICL and PROSE are complementary methods, and combining them provides up to a 9% improvement over ICL alone.
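The two elements named in the abstract can be pictured with a small sketch. This is not the authors' implementation: the function names, the stopping threshold, and the ablation-style consistency check are all assumptions chosen for illustration, and the `toy_*` helpers stand in for what would be LLM calls or a learned similarity metric in a real system.

```python
from typing import Callable, List, Sequence, Tuple

# All names below are hypothetical. In a real system, generate/propose/score
# would each wrap an LLM call or a similarity metric.
Generate = Callable[[str, Sequence[str]], str]        # (task, preferences) -> draft
Propose = Callable[[List[str], str, str], List[str]]  # (preferences, draft, sample) -> preferences
Score = Callable[[str, str], float]                   # (draft, sample) -> similarity in [0, 1]


def refine_preferences(task: str, sample: str, prefs: Sequence[str],
                       generate: Generate, propose: Propose, score: Score,
                       max_rounds: int = 3, threshold: float = 0.9) -> List[str]:
    """Iteratively revise preferences until the draft resembles the user's sample."""
    current = list(prefs)
    for _ in range(max_rounds):
        draft = generate(task, current)
        if score(draft, sample) >= threshold:
            break  # the draft is close enough to the user's writing
        current = propose(current, draft, sample)
    return current


def verify_preferences(prefs: Sequence[str], history: Sequence[Tuple[str, str]],
                       generate: Generate, score: Score) -> List[str]:
    """Keep a preference only if it improves the match on most past samples.

    Assumption: an ablation-style check (with vs. without the preference).
    """
    kept = []
    for pref in prefs:
        without = [p for p in prefs if p != pref]
        helps = sum(
            score(generate(task, prefs), sample) > score(generate(task, without), sample)
            for task, sample in history
        )
        if helps > len(history) / 2:
            kept.append(pref)
    return kept


# Toy stand-ins so the sketch runs without an LLM.
def toy_generate(task: str, prefs: Sequence[str]) -> str:
    return " ".join([task, *sorted(prefs)])


def toy_score(draft: str, sample: str) -> float:
    a, b = set(draft.split()), set(sample.split())
    return len(a & b) / len(a | b)  # Jaccard overlap of word sets


def toy_propose(prefs: List[str], draft: str, sample: str) -> List[str]:
    missing = [w for w in sample.split() if w not in draft.split()]
    return prefs + missing[:1]  # add one missing trait per round
```

With the toy helpers, refining from an empty preference list against the sample `"email short friendly"` recovers `["short", "friendly"]`, and verification against two past samples drops an unsupported preference such as `"formal"`.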

Problem

Research questions and friction points this paper is trying to address.

Improves precision of inferred user preferences from writing samples

Addresses generic preference descriptions in existing LLM alignment methods

Enhances personalized interactions by refining and verifying user preferences

Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative refinement of inferred preferences

Verification across multiple writing samples

Combining ICL with PROSE, which yields up to a 9% improvement over ICL alone
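One way to picture the combination of ICL and an inferred preference description is a prompt that carries both. The sketch below is only an assumed prompt layout; the function name `build_prompt` and the wording of the prompt are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def build_prompt(task: str, preferences: Sequence[str],
                 examples: Sequence[Tuple[str, str]]) -> str:
    """Combine an inferred preference description with in-context examples."""
    # Preference description (the part a method like PROSE would infer).
    lines = ["The user prefers writing that is:"]
    lines += [f"- {p}" for p in preferences]
    # In-context examples: past tasks paired with the user's own text.
    for i, (past_task, past_text) in enumerate(examples, start=1):
        lines += ["", f"Example {i} task: {past_task}",
                  f"Example {i} user text: {past_text}"]
    # The new task the writing agent should complete.
    lines += ["", f"Task: {task}", "Response:"]
    return "\n".join(lines)
```

Passing an empty `examples` list gives a preference-only prompt, and passing an empty `preferences` list gives an ICL-only prompt (the preference header line is still emitted), so the same helper can cover both baselines in a comparison.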
