LLM-as-a-Judge: Toward World Models for Slate Recommendation Systems

📅 2025-11-06
🤖 AI Summary
This study addresses the challenge of modeling user preferences across domains in multi-domain slate recommendation. It uses large language models (LLMs) as “world models” of user preferences, expressing the preference function through the LLMs’ pairwise comparison capability without fine-tuning, which enables zero-shot application across multiple tasks and datasets. Key contributions are: (1) positioning LLMs as a paradigm for general-purpose preference world modeling; (2) an empirical evaluation on multiple slate recommendation tasks that relates performance to intrinsic properties of the preference function, such as smoothness and transitivity; and (3) identifying the alignment between prompt design and the underlying preference structure as an important axis for improvement. The results highlight the potential of LLMs as world models for sequential slate recommendation.
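Transitivity, one of the preference properties mentioned above, can be made concrete with a small sketch. This is not the paper's measurement procedure. The function name `transitivity_rate` and the `prefers` callback are hypothetical, and the callback stands in for whatever pairwise judgment (for example, an LLM's answer) is being analyzed.

```python
from itertools import permutations
from typing import Callable, Hashable, Sequence


def transitivity_rate(
    items: Sequence[Hashable],
    prefers: Callable[[Hashable, Hashable], bool],
) -> float:
    """Fraction of chains a > b > c for which a > c also holds.

    `prefers(x, y)` is an assumed pairwise judgment that returns True
    when x is preferred to y. Returns 1.0 if no chain exists.
    """
    chains = 0
    consistent = 0
    for a, b, c in permutations(items, 3):
        if prefers(a, b) and prefers(b, c):
            chains += 1
            if prefers(a, c):
                consistent += 1
    return consistent / chains if chains else 1.0
```

A preference induced by a scalar score is fully transitive under this measure, while a cyclic preference (rock beats scissors, scissors beats paper, paper beats rock) scores zero.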

📝 Abstract
Modeling user preferences across domains remains a key challenge in slate recommendation (i.e., recommending an ordered sequence of items) research. We investigate how Large Language Models (LLMs) can effectively act as world models of user preferences through pairwise reasoning over slates. We conduct an empirical study involving several LLMs on three tasks spanning different datasets. Our results reveal relationships between task performance and properties of the preference function captured by LLMs, hinting towards areas for improvement and highlighting the potential of LLMs as world models in recommender systems.
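As a rough illustration of pairwise reasoning over slates, the sketch below scores a judge on labeled slate pairs. It rests on several assumptions: the prompt template, the `Judge` callable, and `stub_judge` are hypothetical and are not taken from the paper. In practice the judge would wrap a call to an LLM.

```python
from typing import Callable, Sequence, Tuple

Slate = Sequence[str]
# Assumed interface: a judge maps a prompt to the answer "A" or "B".
Judge = Callable[[str], str]
# (interaction history, slate A, slate B, preferred label "A" or "B")
Example = Tuple[Slate, Slate, Slate, str]


def build_prompt(history: Slate, slate_a: Slate, slate_b: Slate) -> str:
    """Format one pairwise comparison as text (hypothetical template)."""
    return (
        "A user previously interacted with: " + ", ".join(history) + "\n"
        "Slate A: " + ", ".join(slate_a) + "\n"
        "Slate B: " + ", ".join(slate_b) + "\n"
        "Which slate would this user prefer? Answer with A or B."
    )


def pairwise_accuracy(judge: Judge, examples: Sequence[Example]) -> float:
    """Share of pairs where the judge's answer matches the label."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    for history, slate_a, slate_b, label in examples:
        answer = judge(build_prompt(history, slate_a, slate_b)).strip().upper()
        if answer == label:
            correct += 1
    return correct / len(examples)


def stub_judge(prompt: str) -> str:
    """Toy stand-in for an LLM call: always answers A."""
    return "A"
```

Swapping `stub_judge` for a function that queries a real model leaves the evaluation loop unchanged, which is the appeal of treating the LLM as a black-box preference function.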
Problem

Research questions and friction points this paper is trying to address.

Modeling user preferences across domains in slate recommendation
Using LLMs as world models through pairwise reasoning
Investigating task performance relationships with preference function properties
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs act as world models for user preferences
Using pairwise reasoning over recommendation slates
Empirical study across multiple tasks and datasets
Baptiste Bonin
Université Laval (IID), Mila – Quebec AI Institute
M. Heuillet
Université Laval (IID), Mila – Quebec AI Institute
Audrey Durand
Assistant Professor, Université Laval, Canada
bandits, reinforcement learning, health informatics