Positional Bias in Binary Question Answering: How Uncertainty Shapes Model Preferences

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies a positional bias in binary question-answering by large language models (LLMs): models systematically favor options based on their position (e.g., “A” vs. “B”) in the answer list, with bias intensity growing exponentially with answer uncertainty. To quantify this effect, the authors construct a multi-level uncertainty benchmark via progressive context removal and SQuAD-it–based distractor injection, validated on high-uncertainty datasets including WebGPT and Winning Arguments. They propose two novel metrics—Preference Fairness and Position Consistency—to assess decision fairness and order robustness. Experiments across five state-of-the-art LLMs confirm that positional bias vanishes under low uncertainty but escalates sharply with task difficulty. This study establishes, for the first time, a quantifiable relationship between answer uncertainty and positional bias, providing both a theoretical framework and practical tools for evaluating LLM fairness.
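The benchmark construction described above (progressive context removal plus out-of-context distractor injection) can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline; the function name `degrade_example` and the level-to-context mapping are assumptions made for the sketch.

```python
import random

def degrade_example(question, context_sentences, correct, distractor_pool, level, rng):
    """Build a higher-uncertainty variant of a binary QA item by keeping a
    shrinking prefix of the context and drawing the incorrect option from an
    out-of-context distractor pool (hypothetical helper; the paper's exact
    construction may differ)."""
    keep = max(0, len(context_sentences) - level)   # level 0 = full context
    context = " ".join(context_sentences[:keep])
    wrong = rng.choice(distractor_pool)             # out-of-context distractor
    options = [correct, wrong]
    rng.shuffle(options)                            # avoid a fixed gold position
    return {"question": question, "context": context, "options": options}
```

Raising `level` removes more supporting context, so the model must decide between the two options with less evidence, which is how the benchmark spans low to high uncertainty.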

📝 Abstract
Positional bias in binary question answering occurs when a model systematically favors one option over the other based solely on the order in which the options are presented. In this study, we quantify and analyze positional bias across five large language models under varying degrees of answer uncertainty. We adapted the SQuAD-it dataset by adding an extra incorrect answer option, then created multiple versions with progressively less context and more out-of-context answers, yielding datasets that range from low to high uncertainty. We additionally evaluate two naturally higher-uncertainty benchmarks: (1) WebGPT, answer pairs with unequal human-assigned quality scores, and (2) Winning Arguments, where models predict the more persuasive argument in Reddit's r/ChangeMyView exchanges. For each dataset, the order of the "correct" (or higher-quality/more persuasive) option is systematically flipped (first placed in position 1, then in position 2) to compute both Preference Fairness and Position Consistency. We observe that positional bias is nearly absent under low-uncertainty conditions but grows exponentially as it becomes harder to determine which option is correct.
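The order-flip protocol above pairs each item's answer under the original ordering with its answer under the flipped ordering, then scores the pairs. The abstract does not give closed-form definitions of the two metrics, so the functions below are plausible formulations for illustration only, not the paper's exact equations.

```python
from typing import List, Tuple

def position_consistency(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of items where the model picks the same underlying option in
    both orderings. Each pair holds the position chosen ("A" or "B") for the
    original and the flipped presentation; since flipping swaps the options,
    a consistent model changes its positional answer."""
    consistent = sum(1 for first, second in pairs if first != second)
    return consistent / len(pairs)

def preference_fairness(pairs: List[Tuple[str, str]]) -> float:
    """1.0 when picks are split evenly between the two positions across both
    orderings, 0.0 when the model always chooses the same slot."""
    picks = [p for pair in pairs for p in pair]
    rate_first = picks.count("A") / len(picks)
    return 1.0 - abs(2.0 * rate_first - 1.0)
```

Under this formulation, a model with no positional bias answers consistently across the flip and spreads its picks evenly over the two slots, so both scores approach 1.0.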
Problem

Research questions and friction points this paper is trying to address.

Quantify positional bias in binary question answering models
Analyze bias under varying uncertainty levels in datasets
Evaluate bias in high-uncertainty benchmarks like WebGPT
Innovation

Methods, ideas, or system contributions that make the work stand out.

Construct a multi-level uncertainty benchmark via progressive context removal and distractor injection
Adapt SQuAD-it with an added incorrect answer option
Propose Preference Fairness and Position Consistency metrics