Cards Against LLMs: Benchmarking Humor Alignment in Large Language Models

📅 2026-04-09
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This study addresses the lack of systematic evaluation of large language models (LLMs) in humor comprehension and alignment with human preferences. The authors introduce the first large-scale benchmark for humor alignment by engaging five state-of-the-art LLMs in 9,894 rounds of Cards Against Humanity, assessing their performance on a ten-choice humorous response task against human judgments. Results show that all models significantly outperform random baselines yet exhibit only limited alignment with human preferences. Notably, inter-model agreement substantially exceeds model-human agreement, suggesting that models' humor judgments may be driven more by shared reasoning structures than by genuine preference alignment. Further analysis uncovers systematic biases, including positional effects and content-based preferences, that influence model selections.

📝 Abstract
Humor is one of the most culturally embedded and socially significant dimensions of human communication, yet it remains largely unexplored as a dimension of Large Language Model (LLM) alignment. In this study, five frontier language models play the same games of Cards Against Humanity (CAH) as human players. The models select the funniest response from a slate of ten candidate cards across 9,894 rounds. While all models exceed the random baseline, alignment with human preference remains modest. More striking is that models agree with each other substantially more often than they agree with humans. We show that this preference is partly explained by systematic position biases and content preferences, raising the question of whether LLM humor judgment reflects genuine preference or structural artifacts of inference and alignment.
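
The three quantities the abstract compares (agreement with the human pick, agreement between models, and how often each slate position is chosen) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names, the model labels, and the five toy rounds are hypothetical, and each pick is assumed to be recorded as the 0-based position of the chosen card in the ten-card slate. The only figure taken from the source is the slate size of ten, which gives a random baseline of 1/10.

```python
from collections import Counter
from itertools import combinations

NUM_CHOICES = 10                    # slate size stated in the abstract
RANDOM_BASELINE = 1 / NUM_CHOICES   # chance of matching the human pick by guessing


def agreement_rate(picks_a, picks_b):
    """Fraction of rounds in which two judges picked the same slate position."""
    if not picks_a or len(picks_a) != len(picks_b):
        raise ValueError("both judges need picks for the same non-empty set of rounds")
    matches = sum(a == b for a, b in zip(picks_a, picks_b))
    return matches / len(picks_a)


def mean_pairwise_agreement(model_picks):
    """Average agreement rate over every unordered pair of models."""
    pairs = list(combinations(model_picks.values(), 2))
    if not pairs:
        raise ValueError("need at least two models")
    return sum(agreement_rate(a, b) for a, b in pairs) / len(pairs)


def position_histogram(picks):
    """Share of picks landing on each slate position; a flat profile means no position bias."""
    counts = Counter(picks)
    return [counts.get(i, 0) / len(picks) for i in range(NUM_CHOICES)]


# Hypothetical toy data: five rounds, one human pick and three model picks per round.
human = [0, 3, 7, 2, 9]
models = {
    "model_a": [0, 3, 1, 1, 0],
    "model_b": [0, 3, 1, 2, 0],
    "model_c": [0, 4, 1, 1, 0],
}

model_human = {name: agreement_rate(picks, human) for name, picks in models.items()}
inter_model = mean_pairwise_agreement(models)
profile_a = position_histogram(models["model_a"])
```

In this toy data the models match each other more often than they match the human pick, and `profile_a` concentrates on the first two positions. That mirrors the qualitative pattern the abstract describes, but the numbers are made up for illustration and say nothing about the paper's actual results.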
Problem

Research questions and friction points this paper is trying to address.

humor alignment
Large Language Models
human preference
Cards Against Humanity
position bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

humor alignment
Large Language Models
Cards Against Humanity
preference bias
human-AI alignment