ManzaiSet: A Multimodal Dataset of Viewer Responses to Japanese Manzai Comedy

📅 2025-10-20
🤖 AI Summary
Problem: Persistent Western-centric bias in affective computing and a lack of multimodal audience response data for Japanese *manzai* (comic duo) performances hinder cross-cultural emotion modeling. Method: We introduce the first large-scale, real-world multimodal dataset of audience reactions to professional *manzai*, comprising synchronized facial video and audio recordings from 241 participants. We analyze it with K-means clustering, individual slope analysis, automated humor classification, and FDR-corrected statistical inference. Contribution/Results: We identify three viewer archetypes: High and Stable Appreciators (72.8%), Low and Variable Decliners (13.2%), and Variable Improvers (14.0%). Individual-level slopes show responses rising with viewing order, which contradicts the fatigue hypothesis. The findings reveal substantial heterogeneity among viewers and a positive viewing-order effect in a non-Western audience. The dataset offers a benchmark and an empirical foundation for cross-cultural affective AI and personalized entertainment systems.

📝 Abstract
We present ManzaiSet, the first large-scale multimodal dataset of viewer responses to Japanese manzai comedy, capturing facial videos and audio from 241 participants watching up to 10 professional performances in randomized order (94.6 percent watched at least 8; analyses focus on n=228). This addresses the Western-centric bias in affective computing. Three key findings emerge: (1) k-means clustering identified three distinct viewer types: High and Stable Appreciators (72.8 percent, n=166), Low and Variable Decliners (13.2 percent, n=30), and Variable Improvers (14.0 percent, n=32), with heterogeneity of variance (Brown-Forsythe p < 0.001); (2) individual-level analysis revealed a positive viewing-order effect (mean slope = 0.488, t(227) = 5.42, p < 0.001, permutation p < 0.001), contradicting fatigue hypotheses; (3) automated humor classification (77 instances, 131 labels) plus viewer-level response modeling found no type-wise differences after FDR correction. The dataset enables culturally aware emotion AI development and personalized entertainment systems tailored to non-Western contexts.
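The individual-level viewing-order analysis in finding (2) can be illustrated with a minimal sketch. It assumes each viewer has one numeric response per performance, ordered by viewing position; the abstract does not specify the response measure or the permutation scheme, so the sign-flip test and the synthetic ratings below are assumptions for illustration, not the authors' pipeline.

```python
import math
import random
import statistics


def ols_slope(ys):
    """Least-squares slope of ys against viewing positions 1..len(ys)."""
    n = len(ys)
    xs = range(1, n + 1)
    x_mean = (n + 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def one_sample_t(values):
    """t statistic for H0: mean == 0 (df = len(values) - 1)."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def sign_flip_p(values, n_perm=2000, seed=0):
    """Two-sided sign-flip permutation p-value for H0: mean == 0.

    Assumed scheme: under H0 each viewer's slope is equally likely
    to be positive or negative, so signs are flipped at random.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(values))
    hits = 0
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if abs(statistics.mean(flipped)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic, hypothetical ratings (NOT ManzaiSet data): 50 viewers,
# 10 performances each, with a built-in upward trend of 0.5 per position.
rng = random.Random(42)
viewers = [[5.0 + 0.5 * k + rng.gauss(0, 1) for k in range(10)] for _ in range(50)]

slopes = [ols_slope(v) for v in viewers]
t_stat = one_sample_t(slopes)
p_perm = sign_flip_p(slopes)
print(f"mean slope = {statistics.mean(slopes):.3f}, t = {t_stat:.2f}, perm p = {p_perm:.4f}")
```

A positive mean slope with a small p-value indicates that responses rise with viewing position, the opposite of what a fatigue account predicts. With n = 228 viewers the one-sample t statistic has 227 degrees of freedom, matching the t(227) reported in the abstract.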
Problem

Research questions and friction points this paper is trying to address.

Addressing Western-centric bias in affective computing research
Analyzing multimodal viewer responses to Japanese comedy performances
Identifying distinct viewer types and their response patterns
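Identifying viewer types by clustering can be sketched as follows. This is a plain Lloyd-style k-means with a deterministic farthest-point initialization, chosen here only to keep the example reproducible. The two features per viewer (mean response, viewing-order slope) and the toy values are hypothetical; the source does not state which features were clustered or how they were scaled.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add
    the point farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical per-viewer features: (mean response, viewing-order slope).
stable = [(8.0 + 0.1 * i, 0.0) for i in range(5)]
decliners = [(3.0 + 0.1 * i, -0.5) for i in range(5)]
improvers = [(4.0 + 0.1 * i, 0.8) for i in range(5)]
points = stable + decliners + improvers

labels, centroids = kmeans(points, k=3)
print(labels)
```

On real data the features would normally be standardized first, and the choice of k = 3 would need to be justified with a criterion such as silhouette scores.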
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large multimodal dataset of viewer responses
K-means clustering identified three viewer types
Automated humor classification with response modeling
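The FDR correction applied to the type-wise comparisons can be illustrated with the Benjamini-Hochberg procedure. The source says only "FDR correction", so the choice of Benjamini-Hochberg is an assumption, and the p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return BH-adjusted p-values and reject flags, in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


# Hypothetical raw p-values from four type-wise comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted, rejected = benjamini_hochberg(raw)
print([round(q, 4) for q in adjusted])  # [0.04, 0.0533, 0.0533, 0.2]
print(rejected)                         # [True, False, False, False]
```

Comparisons that look significant in isolation (here 0.04 and 0.03) can stop being significant once the number of tests is taken into account, which is the pattern the abstract describes for type-wise differences.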
Kazuki Kawamura
The University of Tokyo, Sony, Sony CSL Kyoto
Machine Learning, AI-Guided Learning, Human-Computer Interaction, Human Augmentation
Kengo Nakai
Yoshimoto Kogyo Holdings Co., Ltd.
Jun Rekimoto
Sony CSL Kyoto, The University of Tokyo