StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos

📅 2025-05-24
🤖 AI Summary
Existing humor detection models perform poorly on multilingual stand-up comedy videos, particularly when precisely localizing fine-grained punchlines. To address this, we introduce the largest multimodal stand-up comedy dataset to date, spanning seven languages and more than 330 hours of video, which combines automatic laughter annotation of the full dataset with manual annotation of the validation subset. For the first time, we formulate humor detection as a word-level sequence labeling task, departing from conventional binary classification. Our method introduces a laughter detection approach that leverages ASR error augmentation, combined with audio-text-temporal multimodal fusion and sequence modeling (e.g., BERT-CRF) for cross-lingual, context-aware humor localization. We publicly release the dataset and code. Experiments demonstrate substantial improvements in localization accuracy and validate the effectiveness and generalizability of the word-level annotation paradigm in multilingual settings.

📝 Abstract
To improve current computational models of humor detection, we propose a new multimodal dataset of stand-up comedy in seven languages: English, French, Spanish, Italian, Portuguese, Hungarian and Czech. Our dataset of more than 330 hours is, at the time of writing, the largest available for this type of task, and the most diverse. The whole dataset is automatically annotated for audience laughter, and the subset reserved for model validation is manually annotated. Contrary to contemporary approaches, we do not frame humor detection as binary sequence classification but as word-level sequence labeling, in order to take the full context of the sequence into account and to capture the continuous joke tagging mechanism typically occurring in natural conversations. Along with unimodal baseline results, we propose a method to enhance automatic laughter detection based on Automatic Speech Recognition (ASR) errors. Our code and data are available online: https://tinyurl.com/EMNLPHumourStandUpPublic
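
To make the difference between binary sequence classification and word-level sequence labeling concrete, here is a minimal Python sketch. It is not the authors' pipeline: the `Word` class, the `label_words` helper, the `max_gap` threshold, and the rule "a word is tagged when audience laughter starts shortly after it ends" are all assumptions chosen for illustration, and the timestamps are invented toy values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    start: float  # word onset, in seconds
    end: float    # word offset, in seconds


def label_words(words: List[Word],
                laughter: List[Tuple[float, float]],
                max_gap: float = 0.3) -> List[int]:
    """Tag a word with 1 if a laughter segment starts at most `max_gap`
    seconds after the word ends, else 0 (hypothetical labeling rule)."""
    labels = []
    for w in words:
        triggered = any(0.0 <= laugh_start - w.end <= max_gap
                        for laugh_start, _ in laughter)
        labels.append(1 if triggered else 0)
    return labels


# Toy transcript with word timestamps (invented values).
words = [
    Word("I", 0.0, 0.2), Word("told", 0.2, 0.5), Word("my", 0.5, 0.7),
    Word("cat", 0.7, 1.0), Word("a", 1.0, 1.1), Word("joke", 1.1, 1.5),
    Word("she", 3.0, 3.2), Word("left", 3.2, 3.6),
]
# Audience laughter segments as (start, end) in seconds (invented values).
laughter = [(1.6, 3.0), (3.7, 5.0)]

# Word-level sequence labeling: one tag per word, so both punchlines are localized.
word_labels = label_words(words, laughter)

# Binary sequence classification: a single tag for the whole sequence.
sequence_label = int(any(word_labels))

print(list(zip([w.text for w in words], word_labels)))
print(sequence_label)
```

In this toy case the word-level view marks "joke" and "left" separately, whereas the sequence-level label only records that the passage was funny somewhere.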
Problem

Research questions and friction points this paper is trying to address.

Develop multilingual humor detection in stand-up comedy videos
Create largest diverse dataset with automatic and manual annotations
Propose word-level sequence labeling for continuous joke tagging
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual dataset for humor detection
Word-level sequence labeling approach
Enhanced laughter detection using ASR errors
Authors

Valentin Barriere
Telecom ParisTech
Affective Computing

Nahuel Gomez
Universidad de Chile – DIE, Santiago, Chile

Léo Hemamou
Without Affiliation, Paris, France

Sofia Callejas
INRIA Chile, Santiago, Chile

Brian Ravenet
Université Paris-Saclay
Embodied Conversational Agent, Serious games, Virtual Reality