SMILE-Next: Teaching Large Language Models to Detect, Classify, and Reason about Laughter

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Prior work has treated laughter analysis as a set of isolated, single-task problems, leaving the social meaning of laughter in real-world scenarios underexplored. To bridge this gap, the authors introduce SMILE-Next, a dataset with multimodal textual representations and question-answer annotations covering three tasks: laughter detection, laughter type classification, and laughter reasoning. They further propose a laughter-specialized large language model built on two components. Laughter-specific Self-Instruct automatically synthesizes diverse laughter-centric instructions to improve generalization across tasks and domains. The Mixture-of-Laugh-Experts (MoLE) framework adds task-adaptive expert routing that dynamically selects specialized experts for each laughter-related task, improving task-specific performance and efficiency. Experiments show that combining the two components substantially outperforms multimodal LLM baselines on real-world laughter understanding.

📝 Abstract

Laughter is a complex social signal that conveys communicative intent beyond amusement. While prior work has focused on isolated laughter analysis tasks, a comprehensive understanding of laughter in real-world scenarios remains underexplored. Therefore, we introduce SMILE-Next, a dataset for real-world laughter understanding with multimodal textual representations and question-answer annotations across three tasks: laughter detection, laughter type classification, and laughter reasoning. Building upon SMILE-Next, we aim to develop a laughter-specialized large language model capable of nuanced understanding of laughter in real-world contexts. To this end, we propose two key components: laughter-specific Self-Instruct and the Mixture-of-Laugh-Experts (MoLE) framework. Laughter-specific Self-Instruct enhances generalization across tasks and domains by automatically synthesizing diverse laughter-centric instructions. MoLE introduces a task-adaptive expert routing mechanism that dynamically selects specialized experts tailored to each laughter-related task, improving task-specific performance and efficiency. Experimental results show that the combination of our proposed components substantially outperforms multimodal LLM baselines, advancing robust real-world laughter understanding. Project page is at: https://mok0102.github.io/smile-next/.
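The abstract describes MoLE only at a high level, so the following is a minimal sketch of task-adaptive expert routing in general, not the authors' implementation. The task names come from the abstract; everything else (`route`, `mole_forward`, `make_scale_expert`, the fixed `ROUTER_LOGITS` table, and the toy scaling experts) is a hypothetical stand-in. In particular, a real router would presumably be learned and could condition on the input rather than on a task label alone.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Expert = Callable[[Vector], Vector]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(task: str, router_logits: Dict[str, List[float]], top_k: int = 1) -> Dict[int, float]:
    """Return {expert_index: weight} for the top_k experts of the given task."""
    logits = router_logits[task]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize over the selected experts only, so the weights sum to 1.
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))


def mole_forward(x: Vector, task: str, experts: List[Expert],
                 router_logits: Dict[str, List[float]], top_k: int = 1) -> Vector:
    """Run only the selected experts and mix their outputs by router weight."""
    out = [0.0] * len(x)
    for idx, weight in route(task, router_logits, top_k).items():
        y = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out


def make_scale_expert(scale: float) -> Expert:
    """Hypothetical toy expert: multiplies every feature by a constant."""
    return lambda x: [scale * v for v in x]


# Assumption: one toy expert per task, with hand-set logits instead of a learned router.
EXPERTS = [make_scale_expert(1.0), make_scale_expert(2.0), make_scale_expert(3.0)]
ROUTER_LOGITS = {
    "detection": [5.0, 0.0, 0.0],
    "type_classification": [0.0, 5.0, 0.0],
    "reasoning": [0.0, 0.0, 5.0],
}

if __name__ == "__main__":
    print(mole_forward([1.0, 2.0], "reasoning", EXPERTS, ROUTER_LOGITS))
```

With `top_k=1`, only one expert runs per input, which is where the efficiency benefit of sparse routing typically comes from; raising `top_k` trades compute for a blend of several experts.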

Problem

Research questions and friction points this paper is trying to address.

laughter understanding

multimodal LLM

social signal

real-world laughter

laughter reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

laughter understanding

Self-Instruct

Mixture-of-Experts