🤖 AI Summary
Current AI systems lag well behind humans at text-free, purely visual humor understanding, particularly image-based humor that requires commonsense reasoning. To address this gap, we introduce HumorDB, a meticulously annotated, publicly available dataset of contrastive image pairs supporting three evaluation paradigms: binary classification (Funny vs. Not Funny), continuous rating on a 1–10 scale, and pairwise preference ranking. This three-task design captures the subjectivity, continuity, and relativity inherent in humor perception. Benchmarking on HumorDB, we systematically evaluate both vision-only and language-augmented multimodal models. Vision-only models achieve limited performance, whereas language-guided multimodal approaches substantially improve both zero-shot and supervised humor recognition. The dataset and code are openly released, making HumorDB a reproducible zero-shot benchmark for text-free visual humor understanding.
📝 Abstract
Despite significant advancements in computer vision, understanding complex scenes, particularly those involving humor, remains a substantial challenge. This paper introduces HumorDB, a novel image-only dataset specifically designed to advance visual humor understanding. HumorDB consists of meticulously curated image pairs with contrasting humor ratings, emphasizing subtle visual cues that trigger humor and mitigating potential biases. The dataset enables evaluation through binary classification (Funny or Not Funny), range regression (funniness on a scale from 1 to 10), and pairwise comparison tasks (Which Image is Funnier?), effectively capturing the subjective nature of humor perception. Initial experiments reveal that while vision-only models struggle, vision-language models, particularly those leveraging large language models, show promising results. HumorDB also shows potential as a valuable zero-shot benchmark for powerful large multimodal models. We open-source both the dataset and code under the CC BY 4.0 license.
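The three evaluation tasks above can be sketched as simple metrics over model outputs. This is a minimal, hypothetical illustration — the function names, the 5.0 funny/not-funny threshold, and the toy scores are assumptions, not taken from the released HumorDB code:

```python
# Hypothetical sketch of HumorDB's three evaluation tasks on a 1-10 scale.
# Threshold, names, and data are illustrative assumptions.

def binary_accuracy(pred_funny, true_funny):
    """Binary task: Funny vs. Not Funny."""
    return sum(p == t for p, t in zip(pred_funny, true_funny)) / len(true_funny)

def mean_absolute_error(pred_scores, true_scores):
    """Range-regression task: funniness rated on a 1-10 scale."""
    return sum(abs(p - t) for p, t in zip(pred_scores, true_scores)) / len(true_scores)

def pairwise_accuracy(pred_scores, pairs, true_winners):
    """Comparison task: which image of a contrastive pair is funnier?"""
    correct = 0
    for (i, j), winner in zip(pairs, true_winners):
        choice = i if pred_scores[i] >= pred_scores[j] else j
        correct += choice == winner
    return correct / len(pairs)

# Toy example: four images with predicted and ground-truth funniness scores.
pred = [7.0, 3.0, 8.5, 2.0]
true = [8.0, 2.0, 9.0, 4.0]
print(binary_accuracy([s >= 5.0 for s in pred], [s >= 5.0 for s in true]))  # 1.0
print(mean_absolute_error(pred, true))                                      # 1.125
print(pairwise_accuracy(pred, [(0, 1), (2, 3)], [0, 2]))                    # 1.0
```

One design point worth noting: the pairwise task can be scored directly from the same scalar ratings used for regression, which is why contrastive image pairs support all three paradigms without separate model heads.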