🤖 AI Summary
This work investigates whether large language models (LLMs) exhibit human-like framing effects—systematic response differences to positive versus negative formulations of identical factual content—in natural text. To this end, we introduce WildFrame, the first real-world, dual-frame contrastive dataset (1,000 instances), curated from authentic corpora and accompanied by human judgments and responses from eight state-of-the-art LLMs. Our study provides the first systematic, direct comparison of framing sensitivity between humans and LLMs on the same data. Results show that both humans and models exhibit a significant preference for positively framed statements, with human–model response correlations reaching *r* ≥ 0.57. These findings provide empirical evidence that LLMs exhibit human-like framing bias, informing work on trustworthy AI, controllable text generation, and bias mitigation. Moreover, WildFrame offers a reusable evaluation paradigm for probing cognitive biases in language models.
📝 Abstract
Humans are influenced by how information is presented, a phenomenon known as the framing effect. Previous work has shown that LLMs may also be susceptible to framing, but has done so on synthetic data and did not compare to human behavior. We introduce WildFrame, a dataset for evaluating LLM responses to positive and negative framing in naturally-occurring sentences, and compare LLMs to humans on the same data. WildFrame consists of 1,000 texts, constructed by first selecting real-world statements with clear sentiment, then reframing them in either a positive or negative light, and lastly collecting human sentiment annotations. By evaluating eight state-of-the-art LLMs on WildFrame, we find that all models exhibit framing effects similar to humans ($r \geq 0.57$), with both humans and models being more influenced by positive than by negative reframing. Our findings benefit model developers, who can either harness framing or mitigate its effects, depending on the downstream application.