Do Large Language Models Plan Answer Positions? Position Bias in Multiple-Choice Question Generation

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines a systematic bias in large language models (LLMs): when they generate multiple-choice questions, the correct answer is unevenly distributed across option positions, which can compromise item quality and assessment reliability. The authors analyze the generation behavior of ten LLMs and five vision-language models on three tasks and complement this with representation probing experiments, finding that hidden representations in the question stem carry signals that predict the correct answer position. This suggests that models may implicitly plan answer positions during generation. Building on this insight, the authors apply activation steering, which partially controls positional preferences and substantially shifts answer-position distributions. Beyond shedding light on model internals, the work offers a practical framework for studying and controlling answer placement in question generation.
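
To make "uneven answer-position distribution" concrete, the sketch below shows one way such a bias could be quantified for a batch of generated questions. It is an illustration only: the function names, the four-option `"ABCD"` layout, and the choice of a chi-square statistic and total variation distance are assumptions, not the metrics reported in the paper.

```python
from collections import Counter


def position_distribution(answer_labels, options="ABCD"):
    """Return the empirical share of correct answers at each option position."""
    if not answer_labels:
        raise ValueError("answer_labels must not be empty")
    counts = Counter(answer_labels)
    total = len(answer_labels)
    return {opt: counts.get(opt, 0) / total for opt in options}


def chi_square_vs_uniform(answer_labels, options="ABCD"):
    """Pearson chi-square statistic of the observed counts against a uniform split."""
    if not answer_labels:
        raise ValueError("answer_labels must not be empty")
    counts = Counter(answer_labels)
    expected = len(answer_labels) / len(options)
    return sum((counts.get(opt, 0) - expected) ** 2 / expected for opt in options)


def total_variation_from_uniform(answer_labels, options="ABCD"):
    """Total variation distance to uniform: 0 means perfectly balanced positions."""
    dist = position_distribution(answer_labels, options)
    uniform = 1 / len(options)
    return 0.5 * sum(abs(p - uniform) for p in dist.values())


# Hypothetical batch of 100 generated questions (made-up labels, not paper data).
labels = ["A"] * 10 + ["B"] * 40 + ["C"] * 30 + ["D"] * 20
stat = chi_square_vs_uniform(labels)
tv = total_variation_from_uniform(labels)
```

A larger statistic or distance indicates a stronger positional preference; comparing these values across models is one simple way to see whether bias patterns cluster within a model family.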

📝 Abstract

Large language models (LLMs) are increasingly used to generate multiple-choice questions (MCQs), where correct answers should ideally be uniformly distributed across options. However, we observe that LLMs exhibit systematic position biases during generation. Through extensive experiments with 10 LLMs and 5 vision-language models (VLMs) on three MCQ generation tasks, we show that these biases are structured, with similar patterns emerging within model families. To investigate the underlying mechanisms, we conduct probing experiments and find that hidden representations in the question stem encode predictive signals of the correct answer position, suggesting that answer position may be implicitly planned during generation. Building on this insight, we apply activation steering to manipulate internal representations and influence answer position. Our results show that steering can partially control positional preferences and substantially shift answer position distributions. Our findings provide a practical framework for studying implicit positional planning in LLMs and highlight the importance of controllable generation for reliable MCQ construction and evaluation.
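
The abstract does not spell out how the steering vectors are built, so the following is only a minimal sketch of one common form of activation steering, assuming a difference-of-means direction added to a hidden state with a scaling factor. The helper names (`steering_vector`, `steer`, `projection`), the strength parameter `alpha`, and the toy two-dimensional vectors are all hypothetical; in practice the hidden states would come from a chosen layer of the model at question-stem tokens.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def steering_vector(target_states, other_states):
    """Difference of means: states whose answer landed at the target position
    minus states whose answer landed elsewhere (an assumed construction)."""
    mu_target = mean_vector(target_states)
    mu_other = mean_vector(other_states)
    return [t - o for t, o in zip(mu_target, mu_other)]


def steer(hidden_state, direction, alpha=1.0):
    """Shift a hidden state along the steering direction by strength alpha."""
    return [h + alpha * d for h, d in zip(hidden_state, direction)]


def projection(hidden_state, direction):
    """Dot product with the direction, used here as a crude linear readout."""
    return sum(h * d for h, d in zip(hidden_state, direction))


# Toy hidden states (made-up numbers, not taken from any model).
target_states = [[1.0, 0.0], [3.0, 0.0]]  # answer was placed at the target position
other_states = [[0.0, 2.0], [0.0, 4.0]]   # answer was placed at other positions
direction = steering_vector(target_states, other_states)
steered = steer([0.0, 0.0], direction, alpha=0.5)
```

The same direction doubles as a simple linear readout: if the projection of stem representations onto it separates answer positions, that is the kind of predictive signal a probing experiment looks for, and adding the direction during generation is the corresponding intervention.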

Problem

Research questions and friction points this paper is trying to address.

position bias

multiple-choice question generation

large language models

answer position

Innovation

Methods, ideas, or system contributions that make the work stand out.

position bias

activation steering

multiple-choice question generation

implicit planning