🤖 AI Summary
Large language models (LLMs) are strong zero-shot rankers, but their performance is sensitive to prompt wording. Role-playing prompts tend to be more robust, yet how they work, and how much their effect varies across role formulations, remains poorly understood. This work uses causal interventions from mechanistic interpretability to systematically investigate how role prompts influence LLMs’ relevance judgments in zero-shot ranking. We find that role information is predominantly encoded in early transformer layers, identify a set of attention heads that carry information critical for role-conditioned relevance, and show that the role signal interacts mainly with the task instructions in middle layers while interacting only weakly with query and document representations. Empirical evaluation shows that the careful formulation of role descriptions has a large effect on ranking quality. To our knowledge, this is the first study to uncover the neural mechanisms and intervention pathways underlying role prompting in retrieval. Our findings provide both theoretical foundations and practical guidelines for interpretable, controllable prompt engineering in information retrieval.
📝 Abstract
Large Language Models (LLMs) have emerged as promising zero-shot rankers, but their performance is highly sensitive to prompt formulation. In particular, role-play prompts, where the model is assigned a functional role or identity, often give more robust and accurate relevance rankings. However, the mechanisms and diversity of role-play effects remain underexplored, limiting both effective use and interpretability. In this work, we systematically examine how role-play variations influence zero-shot LLM rankers. We employ causal intervention techniques from mechanistic interpretability to trace how role-play information shapes relevance judgments in LLMs. Our analysis reveals that (1) the careful formulation of role descriptions has a large effect on the ranking quality of the LLM; and (2) role-play signals are predominantly encoded in early layers and communicate with task instructions in middle layers, while interacting only to a limited extent with query or document representations. Specifically, we identify a group of attention heads that encode information critical for role-conditioned relevance. These findings not only shed light on the inner workings of role-play in LLM ranking but also offer guidance for designing more effective prompts in IR and beyond, pointing toward broader opportunities for leveraging role-play in zero-shot applications.
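The abstract does not spell out the intervention procedure, so the following is only an illustrative sketch of one common causal intervention, activation patching, on a toy model. Everything in it is an assumption made for illustration: the four token positions (role, instruction, query, document), the residual update rule, the random weights, and the names `forward` and `patching_effects` are hypothetical and are not the authors' model or code. The idea is to run the model once with the original role token ("clean") and once with a different role token ("corrupt"), then copy a single clean activation into the corrupt run and measure how much of the score gap is restored.

```python
import numpy as np

def forward(tokens, weights, mix, patch=None):
    """Run a toy residual stack over token states of shape (positions, dim).

    patch: optional (layer, position, vector) that overwrites one token's
    state right after the given layer. This is the causal intervention.
    Returns the final states and a per-layer cache of states.
    """
    h = tokens
    cache = []
    for layer, w in enumerate(weights):
        h = h + np.tanh(mix @ h @ w)  # causal token mixing plus residual update
        if patch is not None and patch[0] == layer:
            h[patch[1]] = patch[2]    # h is a fresh array here, so this is safe
        cache.append(h.copy())
    return h, cache

def patching_effects(clean, corrupt, weights, mix, readout, position):
    """Fraction of the clean-vs-corrupt score gap restored per patched layer."""
    clean_out, clean_cache = forward(clean, weights, mix)
    corrupt_out, _ = forward(corrupt, weights, mix)
    clean_score = readout @ clean_out[-1]      # score is read at the last token
    corrupt_score = readout @ corrupt_out[-1]
    gap = clean_score - corrupt_score
    effects = []
    for layer in range(len(weights)):
        patch = (layer, position, clean_cache[layer][position])
        patched_out, _ = forward(corrupt, weights, mix, patch=patch)
        effects.append(float((readout @ patched_out[-1] - corrupt_score) / gap))
    return effects

# Hypothetical toy setup: positions 0..3 stand for role, instruction, query, document.
rng = np.random.default_rng(0)
n_pos, dim, n_layers = 4, 8, 4
mix = np.tril(np.ones((n_pos, n_pos)))
mix /= mix.sum(axis=1, keepdims=True)          # each token averages over earlier tokens
weights = [rng.normal(scale=0.5, size=(dim, dim)) for _ in range(n_layers)]
readout = rng.normal(size=dim)

clean = rng.normal(size=(n_pos, dim))
corrupt = clean.copy()
corrupt[0] = rng.normal(size=dim)              # swap only the role token

role_effects = patching_effects(clean, corrupt, weights, mix, readout, position=0)
last_effects = patching_effects(clean, corrupt, weights, mix, readout, position=n_pos - 1)
print("patch role token, per layer:", np.round(role_effects, 3))
print("patch last token, per layer:", np.round(last_effects, 3))
```

In this toy, patching the role position after the final layer restores nothing, because the score is read from the last position and no later layer can move information there, while patching the last position after the final layer restores the full gap. Layer-by-layer profiles of this kind are what allow a signal to be localized to early or middle layers; in a real transformer the same measurement would be applied to residual-stream states or individual attention-head outputs.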