🤖 AI Summary
This study addresses the challenge of implicit nonverbal emotion perception by multimodal large language models (MLLMs) in emotionally sensitive domains such as healthcare and education. We propose an empathy-aware prompting framework that requires no explicit user control. Methodologically, real-time facial expression recognition (via a commercial service) extracts the user's affective cues, which are then injected as lightweight contextual signals into the prompting pipeline of a locally deployed DeepSeek LLM, requiring no architectural or training modifications. Key contributions include: (1) a seamless integration of implicit emotion perception with LLM-based dialogue generation; (2) a modular design enabling straightforward extension to other nonverbal modalities (e.g., prosody, gesture); and (3) a preliminary service and usability evaluation (N=5) showing that non-verbal input is consistently integrated into coherent model responses, with participants highlighting conversational fluidity.
📝 Abstract
We present Empathic Prompting, a novel framework for multimodal human-AI interaction that enriches Large Language Model (LLM) conversations with implicit non-verbal context. The system integrates a commercial facial expression recognition service to capture users' emotional cues and embeds them as contextual signals during prompting. Unlike traditional multimodal interfaces, empathic prompting requires no explicit user control; instead, it unobtrusively augments textual input with affective information to keep the conversation aligned and smooth. The architecture is modular and scalable, allowing integration of additional non-verbal modules. We describe the system design, implemented through a locally deployed DeepSeek instance, and report a preliminary service and usability evaluation (N=5). Results show consistent integration of non-verbal input into coherent LLM outputs, with participants highlighting conversational fluidity. Beyond this proof of concept, empathic prompting points to applications in chatbot-mediated communication, particularly in domains like healthcare or education, where users' emotional signals are critical yet often opaque in verbal exchanges.
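To make the prompting step concrete, the following is a minimal Python sketch of how non-verbal cues might be attached to a user's message before it reaches the LLM. It is an illustration under stated assumptions, not the authors' implementation: the `NonVerbalCue` structure, the `build_empathic_prompt` function, the bracketed prompt layout, the confidence threshold, and the stand-in `fake_face_module` are all hypothetical, and no real facial expression recognition service or DeepSeek call is made.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NonVerbalCue:
    """One non-verbal observation (hypothetical structure, not from the paper)."""
    source: str        # which module produced it, e.g. "face"
    label: str         # e.g. "sadness"
    confidence: float  # 0.0 to 1.0


# A module is any callable that returns the cues it currently observes.
# Further modalities (prosody, gesture) would be added as extra callables.
CueModule = Callable[[], List[NonVerbalCue]]


def build_empathic_prompt(user_text: str, modules: List[CueModule],
                          min_confidence: float = 0.5) -> str:
    """Prepend sufficiently confident non-verbal cues to the user's text."""
    cues = [c for m in modules for c in m() if c.confidence >= min_confidence]
    if not cues:
        # Nothing reliable was observed: pass the text through unchanged.
        return user_text
    lines = [f"- {c.source}: {c.label} ({c.confidence:.2f})" for c in cues]
    context = "[Non-verbal context, not stated by the user]\n" + "\n".join(lines)
    return f"{context}\n\n[User message]\n{user_text}"


def fake_face_module() -> List[NonVerbalCue]:
    # Stand-in for a facial expression recognition service; values are made up.
    return [NonVerbalCue("face", "sadness", 0.82),
            NonVerbalCue("face", "surprise", 0.10)]


prompt = build_empathic_prompt("I got my test results today.", [fake_face_module])
print(prompt)
```

Because each modality is just another callable in the `modules` list, the sketch mirrors the modular design described above: the user types as usual, and the affective context is added without any explicit action on their part.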