EVOLVE: Emotion and Visual Output Learning via LLM Evaluation

πŸ“… 2024-12-30
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Existing social robots exhibit limited emotional understanding and empathic expression, particularly in open-domain affective response generation and coordinated multimodal nonverbal feedback (e.g., gestures, lighting). Method: We propose a novel embodied framework integrating vision-language models (VLMs) with emotion-driven physical control. A large language model (LLM) interprets user affective intent; a VLM enhances contextual perception; an emotion-aligned motion planning module jointly controls RGB lighting and servo actuators to generate temporally coherent, cross-modal empathic behaviors. Contribution/Results: This work introduces the first closed-loop VLM-based architecture for emotion-driven embodied behavior generation, enabling open-domain affective response selection and cross-modal empathy reinforcement. Human-robot interaction experiments demonstrate a 37% improvement in emotion conveyance accuracy and a 2.1-point increase in naturalness rating (5-point scale), significantly enhancing users’ perception of robotic empathy authenticity.

Technology Category

Application Category

πŸ“ Abstract
Human acceptance of social robots is greatly effected by empathy and perceived understanding. This necessitates accurate and flexible responses to various input data from the user. While systems such as this can become increasingly complex as more states or response types are included, new research in the application of large language models towards human-robot interaction has allowed for more streamlined perception and reaction pipelines. LLM-selected actions and emotional expressions can help reinforce the realism of displayed empathy and allow for improved communication between the robot and user. Beyond portraying empathy in spoken or written responses, this shows the possibilities of using LLMs in actuated, real world scenarios. In this work we extend research in LLM-driven nonverbal behavior for social robots by considering more open-ended emotional response selection leveraging new advances in vision-language models, along with emotionally aligned motion and color pattern selections that strengthen conveyance of meaning and empathy.
Problem

Research questions and friction points this paper is trying to address.

Emotional Understanding
Expressive Actions
Color-based Imagery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Visual Language Models
Emotion Recognition and Expression
πŸ”Ž Similar Papers
No similar papers found.
J
Jordan Sinclair
Department of Computer Science, Ritchie School of Computer Science and Engineering, University of Denver, USA
Christopher Reardon
Christopher Reardon
MITRE
roboticsautonomous systemshuman-robot interactionmixed reality