Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing large language models are aligned to a collective voice, producing generic output styles that adapt poorly to individual user preferences. This paper proposes Trial-Error-Explain In-Context Learning (TICL), a fine-tuning-free approach that personalizes text generation with fewer than 10 user-provided examples. Its core contribution is an iterative three-stage prompt-expansion mechanism (trial, error, explain) that adds self-generated negative samples to sharpen the model's discriminative ability, plus fine-grained stylistic explanations that counter the bias toward formal, structural phrasing seen in zero-shot outputs. TICL combines style-aware prompt construction, LLM-as-a-judge evaluation, and iterative reasoning guidance. Empirical evaluation on email, essay, and news-article writing shows TICL achieving pairwise win rates of up to 91.5% against the previous state of the art. Lexical and human evaluations further confirm improvements in stylistic consistency and fidelity to each user's personal style.

📝 Abstract
Language models are aligned to the collective voice of many, resulting in generic outputs that do not align with specific users' styles. In this work, we present Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes language models for text generation tasks with fewer than 10 examples per user. TICL iteratively expands an in-context learning prompt via a trial-error-explain process, adding model-generated negative samples and explanations that provide fine-grained guidance towards a specific user's style. TICL achieves favorable win rates on pairwise comparisons with LLM-as-a-judge up to 91.5% against the previous state-of-the-art and outperforms competitive tuning-free baselines for personalized alignment tasks of writing emails, essays and news articles. Both lexical and qualitative analyses show that the negative samples and explanations enable language models to learn stylistic context more effectively and overcome the bias towards structural and formal phrases observed in their zero-shot outputs. By front-loading inference compute to create a user-specific in-context learning prompt that does not require extra generation steps at test time, TICL presents a novel yet simple approach for personalized alignment.
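The trial-error-explain loop described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the functions `generate`, `judge_matches_style`, and `explain_mismatch` are hypothetical stand-ins for LLM calls (generation, LLM-as-a-judge comparison, and explanation generation, respectively).

```python
def generate(prompt, task):
    # Stand-in for an LLM generation call conditioned on the current prompt.
    return f"draft for {task!r} using {len(prompt)} prompt segments"

def judge_matches_style(candidate, reference):
    # Stand-in for an LLM-as-a-judge check of stylistic match
    # against the user's gold example.
    return candidate == reference

def explain_mismatch(candidate, reference):
    # Stand-in for an LLM call that produces a fine-grained
    # explanation of how the candidate deviates from the user's style.
    return f"stylistic critique of {candidate[:20]!r}"

def ticl_expand(user_examples, max_rounds=3):
    """Build a user-specific in-context prompt via trial-error-explain rounds.

    user_examples: list of (task, reference_text) pairs, fewer than 10 per user.
    Returns the expanded prompt as (segment_kind, text) pairs.
    """
    prompt = [("example", ex) for ex in user_examples]
    for task, reference in user_examples:
        for _ in range(max_rounds):
            candidate = generate(prompt, task)               # trial
            if judge_matches_style(candidate, reference):
                break
            explanation = explain_mismatch(candidate, reference)
            prompt.append(("negative", candidate))           # error
            prompt.append(("explanation", explanation))      # explain
    return prompt
```

Consistent with the abstract, all of this inference compute is front-loaded: once `ticl_expand` has produced the user-specific prompt, test-time generation is a single ordinary in-context call with no extra steps.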
Problem

Research questions and friction points this paper is trying to address.

Generic, one-size-fits-all outputs that do not match individual users' styles
Zero-shot bias toward structural and formal phrasing
Personalizing models from fewer than 10 user examples without fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Trial-Error-Explain In-Context Learning
User-specific prompt expansion
Model-generated negative samples