HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation

📅 2026-03-19
📈 Citations: 1
Influential: 0
🤖 AI Summary
Large language models struggle to generate high-quality humor because their training objectives are fundamentally misaligned with the incongruity and surprise that humor requires. To address this, the authors propose a cognitive synergy framework that, for the first time, integrates psychological theories of humor into data construction. They synthesize diverse humorous content with a Mixture-of-Thought strategy driven by six cognitive personas (such as the Absurdist and the Cynic), then apply persona-guided data distillation followed by supervised fine-tuning to train HumorGen, a 7B-parameter model. In experiments, HumorGen significantly outperforms larger instruction-tuned baselines, achieves state-of-the-art performance among open-source models, and rivals leading closed-source systems. The authors take this as evidence that cognitive synergy in data distillation plays a critical role in effective humor generation.
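The persona fan-out described in the summary can be pictured with a minimal sketch. It assumes each persona is applied as a prompt prefix and that a teacher model is available as a plain text-to-text function. The prompt template, the function names, and the stand-in teacher are all hypothetical. Only the two persona names that appear in the text are used, since the remaining four are not listed here.

```python
from typing import Callable, Dict, List

# Only these two persona names appear in the source text; the paper uses six.
PERSONAS: List[str] = ["The Absurdist", "The Cynic"]


def build_persona_prompt(persona: str, prompt: str) -> str:
    """Hypothetical template; the authors' actual prompt wording is not given."""
    return f"You are {persona}. Write one humorous response to: {prompt}"


def synthesize_candidates(
    prompt: str,
    generate: Callable[[str], str],
    personas: List[str] = PERSONAS,
) -> Dict[str, str]:
    """Collect one candidate per persona for the same input prompt."""
    return {p: generate(build_persona_prompt(p, prompt)) for p in personas}


def echo_teacher(text: str) -> str:
    """Stand-in for a teacher LLM call (assumption); it just echoes its input."""
    return f"[teacher output for: {text}]"


candidates = synthesize_candidates("Why did the printer jam?", echo_teacher)
for persona, text in candidates.items():
    print(persona, "->", text)
```

In a real pipeline the stand-in teacher would be replaced by an actual model call, and the per-persona candidates would feed whatever filtering or distillation step the framework applies.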
📝 Abstract
Humor generation poses a significant challenge for Large Language Models (LLMs) because their standard training objective, predicting the most likely next word, inherently conflicts with the surprise and incongruity needed for comedy. To bridge this gap, we introduce the Cognitive Synergy Framework, a theoretically grounded methodology for generating high-quality humor data, inspired by psychological theories of humor. Using a Mixture-of-Thought (MoT) approach, we deploy six cognitive personas (e.g., The Absurdist, The Cynic) to synthesize diverse comedic perspectives for a given prompt. We use the resulting dataset to fine-tune a 7B-parameter student model. We compare Direct Preference Optimization (DPO) with a novel Offline Group Relative Policy Optimization (O-GRPO); our 7B model significantly outperforms larger instruction-tuned baselines and achieves performance competitive with state-of-the-art proprietary models. We find that cognitively driven data curation matters far more for humor generation than the choice of alignment algorithm or model scale. Code and data will be available upon publication.
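For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from summed log-probabilities. It illustrates the general published objective, not the authors' training code. The function name, argument names, and the default `beta` are assumptions made for this example. O-GRPO is not sketched because the abstract does not define it.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the abstract does not report one
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    where each term is the summed log-probability of a full response.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The policy prefers the chosen joke more than the reference does: low loss.
good = dpo_loss(-1.0, -3.0, -2.0, -2.0)
# The policy prefers the rejected joke instead: higher loss.
bad = dpo_loss(-3.0, -1.0, -2.0, -2.0)
print(good < math.log(2) < bad)
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2, which makes a convenient sanity check.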
Problem

Research questions and friction points this paper is trying to address.

humor generation
large language models
cognitive synergy
incongruity
surprise
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cognitive Synergy Framework
Mixture-of-Thought
Persona-Based Distillation
Humor Generation
Theory-Grounded Dataset