🤖 AI Summary
This study addresses the pronounced sensitivity of large language models to prompt ordering in multiple-choice question answering, a phenomenon whose underlying mechanism has remained unclear. Through systematic architectural analysis, controlled experiments, and attention visualization, we show that the causal attention mask, when prompts follow the Question-Option-Context (QOC) order, prevents option tokens from attending to the context, creating an information bottleneck that fundamentally degrades performance. We demonstrate that reordering prompts into Context-Question-Option (CQO) consistently alleviates this issue, yielding an average accuracy improvement of over 14 percentage points across diverse models and datasets. Our findings identify the architectural origin of prompt-order sensitivity and offer a simple yet effective remedy for improving model robustness in multiple-choice reasoning tasks.
📝 Abstract
Large language models exhibit surprising sensitivity to the structure of the prompt, but the mechanisms underlying this sensitivity remain poorly understood. In this work, we conduct an in-depth investigation of a striking case: in multiple-choice question answering, placing context before the question and options (CQO) outperforms the reverse order (QOC) by over 14%p, consistently across a wide range of models and datasets. Through systematic architectural analysis, we identify causal attention as the core mechanism: in QOC prompts, the causal mask prevents option tokens from attending to context tokens, which appear later in the sequence, creating an information bottleneck where the context is effectively invisible to the options.
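The bottleneck follows directly from how a causal mask works: position i may only attend to positions j ≤ i, so any segment placed after the options is unreachable from them. A minimal sketch (not the paper's code; segment lengths and labels are illustrative) makes this concrete:

```python
# Sketch: under a causal mask, position i attends only to positions j <= i.
# We label each token with its segment (Q=question, O=options, C=context)
# and check whether any option token can attend to any context token.

def options_see_context(order, seg_lens):
    """order: segment names in prompt order, e.g. ["Q", "O", "C"].
    seg_lens: toy token counts per segment."""
    labels = [seg for seg in order for _ in range(seg_lens[seg])]
    opt_positions = [i for i, s in enumerate(labels) if s == "O"]
    ctx_positions = [j for j, s in enumerate(labels) if s == "C"]
    # Causal attention: i can see j only if j <= i.
    return any(j <= i for i in opt_positions for j in ctx_positions)

lens = {"Q": 3, "O": 2, "C": 4}  # arbitrary toy sizes

print(options_see_context(["Q", "O", "C"], lens))  # QOC -> False: context hidden
print(options_see_context(["C", "Q", "O"], lens))  # CQO -> True: context visible
```

Under QOC every context position lies to the right of every option position, so no option token can attend to it; reordering to CQO restores visibility without changing the model at all, which is why a pure prompt-order change recovers accuracy.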