How role-play shapes relevance judgment in zero-shot LLM rankers

📅 2025-10-20

🤖 AI Summary

Large language models (LLMs) are strong zero-shot rankers, but their performance is sensitive to prompt wording. Role-playing prompts tend to be more robust, yet their mechanisms and the effect of different role formulations are poorly understood. This work uses causal interventions from mechanistic interpretability to investigate how role prompts influence LLMs' relevance judgments in zero-shot ranking. We find that role information is predominantly encoded in early transformer layers, identify a group of attention heads that encode information critical for role-conditioned relevance, and show that the role signal interacts mainly with the task instructions in middle layers while interacting only to a limited extent with query or document representations. Empirical evaluation shows that carefully formulated role descriptions have a large effect on ranking quality. To our knowledge, this is the first study to examine the internal mechanisms underlying role prompting in retrieval. Our findings offer both an interpretability-based account and practical guidance for more controllable prompt design in information retrieval.

📝 Abstract

Large Language Models (LLMs) have emerged as promising zero-shot rankers, but their performance is highly sensitive to prompt formulation. In particular, role-play prompts, where the model is assigned a functional role or identity, often give more robust and accurate relevance rankings. However, the mechanisms and diversity of role-play effects remain underexplored, limiting both effective use and interpretability. In this work, we systematically examine how role-play variations influence zero-shot LLM rankers. We employ causal intervention techniques from mechanistic interpretability to trace how role-play information shapes relevance judgments in LLMs. Our analysis reveals that (1) careful formulation of role descriptions has a large effect on the ranking quality of the LLM; and (2) role-play signals are predominantly encoded in early layers and communicate with task instructions in middle layers, while interacting only to a limited extent with query or document representations. Specifically, we identify a group of attention heads that encode information critical for role-conditioned relevance. These findings not only shed light on the inner workings of role-play in LLM ranking but also offer guidance for designing more effective prompts in IR and beyond, pointing toward broader opportunities for leveraging role-play in zero-shot applications.
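
The abstract does not spell out which causal intervention is used. One common technique in mechanistic interpretability is activation patching: run the model on a "clean" prompt (with the role) and a "corrupted" prompt (without it), then copy one component's clean activation into the corrupted run and measure how much of the output gap is restored. The sketch below illustrates that idea on a toy additive model. The head names, weights, and the two-feature input are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Optional, Tuple

Stream = List[float]               # toy input: [role_feature, query_feature]
Head = Callable[[Stream], float]   # each head writes one scalar to the score

# Hypothetical toy heads (assumption: not the paper's model or heads).
HEADS: Dict[str, Head] = {
    "L0.H0": lambda s: 2.0 * s[0],         # reads only the role feature
    "L0.H1": lambda s: 0.5 * s[1],         # reads only the query feature
    "L1.H0": lambda s: 1.0 * s[0] * s[1],  # mixes role and query
}


def run(stream: Stream,
        patch: Optional[Dict[str, float]] = None) -> Tuple[float, Dict[str, float]]:
    """Return the relevance score and a cache of every head's output.

    If `patch` maps a head name to a value, that value replaces the
    head's own output for this run.
    """
    cache: Dict[str, float] = {}
    for name, head in HEADS.items():
        out = head(stream)
        if patch is not None and name in patch:
            out = patch[name]
        cache[name] = out
    return sum(cache.values()), cache


def patching_effects(clean: Stream, corrupted: Stream) -> Dict[str, float]:
    """Fraction of the clean-vs-corrupted score gap restored by each head."""
    clean_score, clean_cache = run(clean)
    corrupted_score, _ = run(corrupted)
    gap = clean_score - corrupted_score
    if gap == 0:
        raise ValueError("clean and corrupted runs give the same score")
    effects: Dict[str, float] = {}
    for name in HEADS:
        # Patch a single head with its clean activation, keep the rest corrupted.
        patched_score, _ = run(corrupted, patch={name: clean_cache[name]})
        effects[name] = (patched_score - corrupted_score) / gap
    return effects


# Clean run has the role feature switched on; the corrupted run removes it.
effects = patching_effects(clean=[1.0, 1.0], corrupted=[0.0, 1.0])
for name, effect in effects.items():
    print(f"{name}: restores {effect:.2f} of the gap")
```

In this toy setup the head that reads only the query feature restores none of the gap, which mirrors, in a very simplified way, the kind of evidence one would look for when asking whether a role signal flows through a given component.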

Problem

Research questions and friction points this paper is trying to address.

Examining how role-play prompts affect LLM ranking performance

Analyzing mechanisms of role-play in relevance judgment formation

Identifying neural encoding patterns for role-conditioned relevance assessment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Role-play prompts enhance zero-shot LLM ranking performance

Causal intervention traces role-play effects on relevance judgments

A group of attention heads encodes information critical for role-conditioned relevance
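
To make the notion of a role-play prompt concrete, here is a minimal sketch of how a pointwise relevance prompt could be built with and without a role line, so that role variants can be compared on the same query and document. The template wording, the function names, and the example roles are illustrative assumptions, not the prompts used in the paper.

```python
from typing import Dict, List, Optional


def build_ranking_prompt(query: str, document: str,
                         role: Optional[str] = None) -> str:
    """Assemble a pointwise relevance prompt, optionally prefixed by a role.

    The wording of the template is a hypothetical example.
    """
    parts: List[str] = []
    if role:
        parts.append(f"You are {role}.")
    parts.append("Judge whether the document is relevant to the query. "
                 "Answer 'Yes' or 'No'.")
    parts.append(f"Query: {query}")
    parts.append(f"Document: {document}")
    parts.append("Relevant:")
    return "\n".join(parts)


def role_variants(query: str, document: str,
                  roles: List[str]) -> Dict[str, str]:
    """Build one prompt per role, plus a no-role baseline under 'baseline'."""
    prompts = {"baseline": build_ranking_prompt(query, document)}
    for role in roles:
        prompts[role] = build_ranking_prompt(query, document, role)
    return prompts


# Example roles are placeholders chosen for illustration only.
prompts = role_variants(
    query="effects of caffeine on sleep",
    document="Caffeine intake late in the day can delay sleep onset.",
    roles=["a search engine", "an expert relevance assessor"],
)
print(prompts["an expert relevance assessor"])
```

Keeping the task instruction, query, and document identical across variants means any difference in ranking quality can be attributed to the role line alone.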

🔎 Similar Papers

An Investigation of Prompt Variations for Zero-shot LLM-based Rankers