HARMONI: Multimodal Personalization of Multi-User Human-Robot Interactions with LLMs

📅 2026-01-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing multi-user human-robot interaction systems struggle to sustain long-term, dynamic personalization, which limits the effectiveness of socially assistive robots. To address this, the work proposes HARMONI, a framework for continuous, personalized interaction with multiple users. HARMONI combines multimodal perception, contextual environment modeling, dynamic user profiling, and ethically informed response generation, using large language models to produce context-aware responses. Evaluated on four datasets and in a scenario-driven user study in a nursing home, HARMONI outperforms baseline LLM-driven approaches in user modeling accuracy, personalization quality, and user satisfaction, addressing the limitations of static, single-user personalization.

📝 Abstract

Existing human-robot interaction systems often lack mechanisms for sustained personalization and dynamic adaptation in multi-user environments, limiting their effectiveness in real-world deployments. We present HARMONI, a multimodal personalization framework that leverages large language models to enable socially assistive robots to manage long-term multi-user interactions. The framework integrates four key modules: (i) a perception module that identifies active speakers and extracts multimodal input; (ii) a world modeling module that maintains representations of the environment and short-term conversational context; (iii) a user modeling module that updates long-term speaker-specific profiles; and (iv) a generation module that produces contextually grounded and ethically informed responses. Through extensive evaluation and ablation studies on four datasets, as well as a real-world, scenario-driven user study in a nursing home environment, we demonstrate that HARMONI supports robust speaker identification, online memory updating, and ethically aligned personalization, outperforming baseline LLM-driven approaches in user modeling accuracy, personalization quality, and user satisfaction.
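
The four-module flow described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not HARMONI's implementation: every class, function, and field name here (Observation, WorldModel, UserModel, build_prompt, respond) is hypothetical, the perception step is assumed to have already produced a speaker ID and a transcript, the "I like" rule is a toy stand-in for profile extraction, and the language model is any callable that maps a prompt string to a reply.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Assumed output of a perception module for one turn."""
    speaker_id: str  # identified active speaker
    utterance: str   # transcribed speech; other modalities omitted


@dataclass
class WorldModel:
    """Environment description plus short-term conversational context."""
    environment: str = "unspecified"
    max_turns: int = 4
    recent_turns: list = field(default_factory=list)

    def update(self, obs: Observation) -> None:
        self.recent_turns.append((obs.speaker_id, obs.utterance))
        # Keep only the most recent turns as short-term context.
        self.recent_turns = self.recent_turns[-self.max_turns:]


@dataclass
class UserModel:
    """Long-term profiles keyed by speaker, updated online."""
    profiles: dict = field(default_factory=dict)

    def update(self, obs: Observation) -> None:
        facts = self.profiles.setdefault(obs.speaker_id, [])
        # Toy rule (assumption): store statements that start with "I like".
        if obs.utterance.lower().startswith("i like") and obs.utterance not in facts:
            facts.append(obs.utterance)


def build_prompt(obs: Observation, world: WorldModel, users: UserModel) -> str:
    """Combine environment, short-term context, and the speaker's own profile."""
    known = "; ".join(users.profiles.get(obs.speaker_id, [])) or "nothing yet"
    history = "\n".join(f"{s}: {u}" for s, u in world.recent_turns)
    return (
        # Placeholder for an ethical-alignment instruction (assumption).
        "Respond helpfully and respectfully.\n"
        f"Environment: {world.environment}\n"
        f"Known about {obs.speaker_id}: {known}\n"
        f"Recent turns:\n{history}\n"
        f"Reply to {obs.speaker_id}:"
    )


def respond(obs: Observation, world: WorldModel, users: UserModel, llm) -> str:
    """One interaction turn: update both models, then generate a reply."""
    world.update(obs)
    users.update(obs)
    return llm(build_prompt(obs, world, users))


if __name__ == "__main__":
    world, users = WorldModel(environment="common room"), UserModel()
    stub_llm = lambda prompt: prompt  # stand-in that echoes the prompt
    respond(Observation("anna", "I like gardening"), world, users, stub_llm)
    print(respond(Observation("ben", "What is for lunch?"), world, users, stub_llm))
```

Keeping the short-term context in one shared structure and the long-term profiles keyed by speaker is one simple way to let a reply to one user draw on the shared conversation while only that user's stored profile enters the prompt.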

Problem

Research questions and friction points this paper is trying to address.

human-robot interaction

personalization

multi-user environments

dynamic adaptation

sustained personalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal personalization

large language models

multi-user human-robot interaction