🤖 AI Summary
Current large language models (LLMs) struggle to adapt to individual user preferences in creative writing because their training data typically homogenizes diverse tastes. Method: We introduce LiteraryTaste, the first personalized dataset for creative writing, collected from 60 participants, pairing self-reported stated preferences with behaviorally grounded revealed preferences derived from pairwise choices over short texts. We identify systematic discrepancies between these two preference modalities and show that revealed preferences are more predictive and thus better suited for modeling. We propose a transformer-based individual preference modeling framework, augmented with an LLM-driven interpretability pipeline that characterizes fine-grained preference heterogeneity. Contribution/Results: Our method achieves 75.8% accuracy on individual-level and 67.7% on group-level preference prediction. These results validate the dataset's utility and the feasibility of personalized preference modeling, establishing a user-centered paradigm and a foundational resource for preference-aware creative text generation with LLMs.
📝 Abstract
People have different creative writing preferences, and large language models (LLMs) for these tasks can benefit from adapting to each user's preferences. However, these models are typically trained on datasets that treat varying personal tastes as a monolith. To facilitate the development of personalized creative writing LLMs, we introduce LiteraryTaste, a dataset of reading preferences from 60 people, where each person: 1) self-reported their reading habits and tastes (stated preference), and 2) annotated their preferences over 100 pairs of short creative writing texts (revealed preference). With our dataset, we found that: 1) people diverge in their creative writing preferences, 2) finetuning a transformer encoder achieves 75.8% and 67.7% accuracy when modeling personal and collective revealed preferences, respectively, and 3) stated preferences have limited utility for modeling revealed preferences. With an LLM-driven interpretability pipeline, we analyzed how people's preferences vary. We hope our work serves as a cornerstone for personalizing creative writing technologies.
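The revealed-preference setup above (a reader picks one text from each pair, and a model is trained to predict those choices) can be sketched with a toy Bradley-Terry-style logistic scorer standing in for the paper's fine-tuned transformer encoder. Everything below is illustrative: the feature vectors, the simulated "taste" vector, and the hyperparameters are invented for the sketch, not drawn from LiteraryTaste.

```python
import math
import random

random.seed(0)
DIM = 4  # toy feature dimension (assumption; real texts would be encoded by a transformer)

def score(w, x):
    """Linear utility score for a text's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(pairs, epochs=200, lr=0.5):
    """Fit a Bradley-Terry logistic model.

    pairs: list of (preferred_vec, rejected_vec) tuples, i.e. the
    revealed-preference annotations in pairwise form.
    """
    w = [0.0] * DIM
    for _ in range(epochs):
        for a, b in pairs:
            # P(prefer a over b) = sigmoid(score(a) - score(b))
            p = sigmoid(score(w, a) - score(w, b))
            g = 1.0 - p  # gradient of the log-likelihood w.r.t. the score gap
            for i in range(DIM):
                w[i] += lr * g * (a[i] - b[i])
    return w

def predict(w, a, b):
    """True if the model predicts text a is preferred over text b."""
    return score(w, a) > score(w, b)

# Simulate one reader with a hidden taste vector, then generate 100 pairwise
# choices, mirroring the dataset's 100 annotated pairs per person.
taste = [1.0, 0.5, -0.8, 0.0]

def rand_vec():
    return [random.uniform(-1, 1) for _ in range(DIM)]

pairs = []
for _ in range(100):
    x, y = rand_vec(), rand_vec()
    if score(taste, x) >= score(taste, y):
        pairs.append((x, y))
    else:
        pairs.append((y, x))

w = train(pairs)
acc = sum(predict(w, a, b) for a, b in pairs) / len(pairs)
print(f"training accuracy: {acc:.2f}")
```

In the paper's actual pipeline the linear scorer is replaced by a finetuned transformer encoder operating on the raw text of each pair; the sketch only shows the shape of the pairwise-choice data and the preference-prediction objective.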