One Joke to Rule them All? On the (Im)possibility of Generalizing Humor

📅 2025-08-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) can generalize humor understanding across humor types through targeted training on specific humor tasks, a capability critical for adapting to rapidly evolving online humor forms (e.g., memes, anti-humor, AI failure jokes). Method: We adopt a cross-dataset transfer learning paradigm, systematically evaluating generalization across four heterogeneous humor tasks. Contribution/Results: Experiments reveal that "dad jokes" serve as the most transferable source task, uncovering an asymmetric transferability structure among humor types. The approach reaches up to 75% accuracy on unseen humor tasks, and increasing training-data diversity improves transfer performance by 1.88-4.05% without degrading source-domain accuracy. This work provides the first empirical identification of a highly transferable humor subclass, offering both theoretical grounding and practical guidance for building robust, general-purpose computational humor models.

📝 Abstract
Humor is a broad and complex form of communication that remains challenging for machines. Despite this breadth, most existing research on computational humor has focused on modeling a specific type of humor. In this work, we ask whether competence on one or more specific humor tasks confers any ability to transfer to novel, unseen types; in other words, is this fragmentation inevitable? The question is especially timely as new humor types continuously emerge in online and social media contexts (e.g., memes, anti-humor, AI fails). If Large Language Models (LLMs) are to keep up with this evolving landscape, they must generalize across humor types by capturing deeper, transferable mechanisms. To investigate this, we conduct a series of transfer learning experiments across four datasets representing different humor tasks. We train LLMs under varied diversity settings (1-3 datasets in training, testing on a novel task). Experiments reveal that models are capable of some transfer and can reach up to 75% accuracy on unseen datasets; training on diverse sources improves transferability (1.88-4.05%) with minimal-to-no drop in in-domain performance. Further analysis suggests relations between humor types, with Dad Jokes surprisingly emerging as the best enabler of transfer (though it is difficult to transfer to). We release data and code.
Problem

Research questions and friction points this paper is trying to address.

Investigating humor generalization across diverse computational tasks
Assessing transfer learning capabilities between different humor types
Exploring LLM adaptability to evolving humor in digital contexts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer learning across humor datasets
Training LLMs on diverse humor sources
Analyzing inter-humor-type transfer relationships
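The transfer protocol described above (train on 1-3 humor datasets, evaluate on a humor type never seen in training) can be sketched as a split enumeration. This is a minimal illustration only; the dataset names are hypothetical placeholders, not the paper's actual corpora.

```python
from itertools import combinations

# Hypothetical names for four humor tasks; the paper's actual
# datasets may differ.
DATASETS = ["dad_jokes", "memes", "anti_humor", "puns"]

def transfer_splits(datasets, max_sources=3):
    """Enumerate (source_mixture, target) pairs: the model is trained
    on 1..max_sources datasets and tested on one held-out task that
    appears in none of its training sources."""
    splits = []
    for k in range(1, max_sources + 1):
        for sources in combinations(datasets, k):
            for target in datasets:
                if target not in sources:
                    splits.append((sources, target))
    return splits

splits = transfer_splits(DATASETS)
print(len(splits))  # → 28 cross-dataset evaluation settings
```

Each pair defines one experiment: fine-tune on the source mixture, then measure zero-shot accuracy on the held-out target, which is how diversity (1 vs. 2 vs. 3 sources) can be compared against transfer performance.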