🤖 AI Summary
This work addresses the challenge of scaling fine-grained, evidence-grounded annotation in multimodal emotion recognition, where dynamic and misaligned cross-modal cues complicate consistent labeling. To this end, the authors propose a traceable, event-centric multimodal emotion annotation toolkit that first aligns heterogeneous data through preprocessing, then visualizes the multimodal signals on an interactive shared timeline. The system integrates large language models (LLMs) with modality-specific prompt templates to draft structured emotion annotations for human verification. By coupling LLM drafting with traceable event packets, the approach supports cross-modal consistency checks and improves annotation efficiency and interpretability. A demonstration on VR-based multimodal emotion recordings illustrates the workflow, producing structured, evidence-grounded emotion labels.
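The summary refers to traceable event packets. As a rough illustration only, the sketch below shows how such a packet might bundle a time window, per-modality keyframes, and pointers back to the source recordings; every class, field, and path name here is hypothetical and not the toolkit's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SourcePointer:
    """Pointer back into an original recording, keeping each label traceable."""
    file_path: str   # e.g. "session_03/face_cam.mp4" (illustrative path)
    start_s: float   # window start on the shared timeline (seconds)
    end_s: float     # window end (seconds)


@dataclass
class EventPacket:
    """One candidate emotional event, packaged for LLM drafting and human review."""
    event_id: str
    time_window: Tuple[float, float]                                # (start_s, end_s) after alignment
    keyframes: Dict[str, List[str]] = field(default_factory=dict)   # modality -> keyframe paths
    pointers: List[SourcePointer] = field(default_factory=list)     # evidence provenance
    draft_label: str = ""                                           # LLM proposal, pending verification


# Hypothetical example of one packet handed to the drafting and review steps.
packet = EventPacket(
    event_id="evt_0042",
    time_window=(128.4, 133.0),
    keyframes={"face_video": ["evt_0042/frame_01.jpg"], "vr_pose": ["evt_0042/pose_01.png"]},
    pointers=[SourcePointer("session_03/face_cam.mp4", 128.4, 133.0)],
)
print(packet.event_id, packet.time_window)
```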
📝 Abstract
Multimodal Emotion Recognition (MER) increasingly depends on fine-grained, evidence-grounded annotations, yet inspection and label construction are hard to scale when cues are dynamic and misaligned across modalities. We present an LLM-assisted toolkit that supports multimodal emotion data annotation through an inspectable, event-centered workflow. The toolkit preprocesses and aligns heterogeneous recordings, visualizes all modalities on an interactive shared timeline, and renders structured signals as video tracks for cross-modal consistency checks. It then detects candidate events and packages synchronized keyframes and time windows as event packets with traceable pointers to the source data. Finally, the toolkit integrates an LLM with modality-specific tools and prompt templates to draft structured annotations for analyst verification and editing. We demonstrate the workflow on multimodal VR emotion recordings with representative examples.
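The final step described in the abstract, drafting structured annotations from modality-specific prompt templates for analyst review, could look roughly like the sketch below. The template text, the `call_llm` placeholder, and the output fields are assumptions made for illustration; the paper's actual templates, LLM backend, and annotation schema are not reproduced here.

```python
import json

# Illustrative modality-specific prompt template (not the toolkit's real template).
FACE_VIDEO_TEMPLATE = (
    "You are annotating a {start_s:.1f}-{end_s:.1f}s segment of a VR session.\n"
    "Facial keyframes: {keyframes}\n"
    "Return JSON with fields: emotion, intensity (1-5), evidence (one sentence)."
)


def call_llm(prompt: str) -> str:
    """Placeholder for whatever LLM backend the toolkit uses; returns a canned draft here."""
    return json.dumps({
        "emotion": "surprise",
        "intensity": 3,
        "evidence": "raised brows and a sudden head turn toward the stimulus",
    })


def draft_annotation(event_id: str, start_s: float, end_s: float, keyframes: list) -> dict:
    """Fill the template for one event packet and parse the structured draft."""
    prompt = FACE_VIDEO_TEMPLATE.format(start_s=start_s, end_s=end_s, keyframes=keyframes)
    draft = json.loads(call_llm(prompt))
    draft["event_id"] = event_id     # keep the trace from the label back to its evidence
    return draft                     # handed to the analyst for verification and editing


# Hypothetical usage on one event packet.
print(draft_annotation("evt_0042", 128.4, 133.0, ["evt_0042/frame_01.jpg"]))
```

Keeping the `event_id` on the draft is what would let a reviewer jump from a proposed label back to the synchronized keyframes and source windows during verification.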