EmoSpace: Fine-Grained Emotion Prototype Learning for Immersive Affective Content Generation

📅 2026-02-12
📈 Citations: 0 · Influential citations: 0
🤖 AI Summary
Existing generative methods struggle to model fine-grained emotional semantics, limiting affective expression in immersive VR content. This work proposes an unsupervised emotion-aware generation framework that operates without explicit emotion labels. By leveraging vision–language alignment, the model learns dynamic and interpretable hierarchical emotion prototypes. Fine-grained emotional control is achieved through multi-prototype guidance, temporal emotion fusion, and attention reweighting mechanisms. The approach significantly outperforms current methods across tasks including emotion-conditioned image outpainting, stylized generation, and VR panoramic synthesis. Furthermore, user studies reveal a unique amplification effect of VR environments on emotional perception, underscoring the framework’s effectiveness in immersive settings.
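The summary above describes label-free learning of hierarchical emotion prototypes through vision–language alignment. The sketch below is an illustration only, not the authors' released code: the class name, hierarchy sizes, temperature, and soft-assignment loss are all assumptions about how CLIP-style image features might be matched to a set of learnable prototype vectors.

```python
# Hypothetical sketch: learnable emotion prototypes aligned to CLIP-style image features.
# NOT the paper's implementation; names, shapes, and the loss are assumptions.
import torch
import torch.nn.functional as F

class EmotionPrototypes(torch.nn.Module):
    def __init__(self, num_coarse=8, per_coarse=4, dim=512):
        super().__init__()
        # Hierarchical layout: coarse emotion categories, each holding several fine-grained prototypes.
        self.protos = torch.nn.Parameter(torch.randn(num_coarse, per_coarse, dim) * 0.02)

    def forward(self):
        # Flattened, unit-norm prototypes for cosine-similarity matching.
        p = self.protos.reshape(-1, self.protos.shape[-1])
        return F.normalize(p, dim=-1)

def alignment_loss(image_emb, prototypes, temperature=0.07):
    """Soft-assignment loss: each image embedding (e.g., from a frozen vision-language
    encoder) is pulled toward the prototypes it is most similar to, so prototypes drift
    toward emotional modes of the data without explicit emotion labels."""
    image_emb = F.normalize(image_emb, dim=-1)       # (B, D)
    sim = image_emb @ prototypes.T / temperature     # (B, K) cosine similarities
    assign = sim.softmax(dim=-1)                     # soft responsibility per prototype
    # Maximize similarity to the softly assigned prototypes.
    return -(assign * sim).sum(dim=-1).mean()
```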

📝 Abstract
Emotion is important for creating compelling virtual reality (VR) content. Although some generative methods have been applied to lower the barrier to creating emotionally rich content, they fail to capture nuanced emotional semantics and to provide the fine-grained control essential for immersive experiences. To address these limitations, we introduce EmoSpace, a novel framework for emotion-aware content generation that learns dynamic, interpretable emotion prototypes through vision-language alignment. We employ a hierarchical emotion representation with rich learnable prototypes that evolve during training, enabling fine-grained emotional control without requiring explicit emotion labels. We develop a controllable generation pipeline featuring multi-prototype guidance, temporal blending, and attention reweighting that supports diverse applications, including emotional image outpainting, stylized generation, and emotional panorama generation for VR environments. Our experiments demonstrate the superior performance of EmoSpace over existing methods in both qualitative and quantitative evaluations. Additionally, we present a comprehensive user study investigating how VR environments affect emotional perception compared to desktop settings. Our work facilitates immersive visual content generation with fine-grained emotion control and supports applications such as therapy, education, storytelling, artistic creation, and cultural preservation. Code and models will be made publicly available.
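The three generation-time controls named in the abstract (multi-prototype guidance, temporal blending, and attention reweighting) could look roughly like the following. This is a minimal sketch under assumptions, not the EmoSpace pipeline: the function names, blend weights, and scaling factor are hypothetical placeholders.

```python
# Hypothetical sketch of the generation-time controls mentioned in the abstract.
# An illustration under assumptions, not the released EmoSpace pipeline.
import torch
import torch.nn.functional as F

def blend_prototypes(prototypes, weights):
    """Multi-prototype guidance: combine several emotion prototypes into one
    conditioning vector with user-specified weights (e.g., 0.7 'serene' + 0.3 'awe')."""
    w = torch.tensor(weights, dtype=prototypes.dtype)
    w = w / w.sum()
    return F.normalize((w[:, None] * prototypes).sum(dim=0), dim=-1)

def temporal_blend(cond_a, cond_b, t):
    """Temporal blending: interpolate between two emotion conditions over a
    sequence (t in [0, 1]), e.g., a panorama whose mood shifts across views."""
    return F.normalize((1 - t) * cond_a + t * cond_b, dim=-1)

def reweight_attention(attn_scores, emotion_token_ids, scale=1.5):
    """Attention reweighting: amplify cross-attention scores on emotion-related
    tokens before the softmax, strengthening their influence on the output image."""
    attn_scores = attn_scores.clone()
    attn_scores[..., emotion_token_ids] *= scale
    return attn_scores.softmax(dim=-1)
```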
Problem

Research questions and friction points this paper is trying to address.

emotion
virtual reality
fine-grained control
affective content generation
immersive experience
Innovation

Methods, ideas, or system contributions that make the work stand out.

emotion prototype learning
vision-language alignment
fine-grained emotional control
controllable generation
immersive VR content
Authors

Bingyuan Wang
The Hong Kong University of Science and Technology (Guangzhou)
Generative AI, Affective Computing, Immersive Storytelling, Creative Intelligence, Cultural Heritage

Xingbei Chen
The Hong Kong University of Science and Technology (Guangzhou)

Zongyang Qiu
The Hong Kong University of Science and Technology (Guangzhou)

Linping Yuan
The Hong Kong University of Science and Technology

Zeyu Wang
The Hong Kong University of Science and Technology (Guangzhou)
Computer Graphics, Human-Computer Interaction, Creative Intelligence, Cultural Heritage