ARIS: Agentic and Relationship Intelligence System for Social Robots

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current social robots remain limited in multi-turn interaction, social-relationship reasoning, and contextually grounded dialogue at scale. This work proposes ARIS, a modular agentic framework that builds a knowledge-graph Social World Model of the relationships between users and combines multimodal reasoning (speech, vision, and physical action) with retrieval-augmented generation (RAG). The graph supports social reasoning and re-identification of users across encounters, while the RAG pipeline keeps latency bounded and responses relevant as dialogue histories grow to thousands of exchanges; structured APIs coordinate the components. In a user study with the Pepper robot (N=23), ARIS scored significantly higher than a large language model baseline on perceived intelligence, animacy, anthropomorphism, and likeability.
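To make the idea of a user-centric relationship graph concrete, here is a minimal sketch in Python. It is not the ARIS implementation, which the summary does not describe at this level; the class `SocialGraph` and its methods `observe_user`, `relate`, `relations_of`, and `mutual_contacts` are hypothetical names chosen for illustration, and user identity is assumed to arrive as a ready-made string ID rather than from face or voice recognition.

```python
from collections import defaultdict


class SocialGraph:
    """Toy relationship graph (hypothetical; not the ARIS Social World Model)."""

    def __init__(self):
        # user_id -> {other_user_id: relation label}
        self._edges = defaultdict(dict)
        # user_id -> number of sessions in which the user has been seen
        self._sessions = defaultdict(int)

    def observe_user(self, user_id):
        """Record an encounter; return True if the user was seen before."""
        seen_before = self._sessions[user_id] > 0
        self._sessions[user_id] += 1
        return seen_before

    def relate(self, a, b, relation):
        """Add or update a symmetric relationship between two users."""
        self._edges[a][b] = relation
        self._edges[b][a] = relation

    def relations_of(self, user_id):
        """Return a copy of the user's direct relationships."""
        return dict(self._edges.get(user_id, {}))

    def mutual_contacts(self, a, b):
        """Return users connected to both a and b, sorted by ID."""
        return sorted(set(self._edges.get(a, {})) & set(self._edges.get(b, {})))


graph = SocialGraph()
graph.observe_user("alice")                # first encounter
graph.relate("alice", "bob", "colleague")
graph.relate("bob", "carol", "friend")
graph.relate("alice", "carol", "sibling")
print(graph.observe_user("alice"))         # recognized on a later session
print(graph.mutual_contacts("alice", "bob"))
```

Storing relationships as explicit edges, rather than leaving them implicit in dialogue text, is what lets a query such as "who do these two users both know?" be answered by a set intersection instead of by re-reading the conversation history.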

📝 Abstract

Foundational models have advanced social robotics, enabling richer perception and communicative interaction with users. However, current systems still struggle with multi-turn engagement, social-relationship reasoning, and contextually grounded dialogue at scale. We present ARIS (Agentic and Relationship Intelligence System), an agentic AI framework that unifies multimodal reasoning, a graph-based Social World Model, and retrieval-augmented generation (RAG) within a single modular architecture for social robots. We evaluate ARIS with the Pepper robot in a robot-mediated dyadic conversational setting, comparing it against a large language model baseline. A user study (N=23) shows that ARIS yields significantly higher perceived intelligence, animacy, anthropomorphism, and likeability. Our contributions are threefold: (1) a Social World Model that explicitly maps and updates social relationships between users through a knowledge graph, enabling social reasoning and re-identification across encounters; (2) an efficient RAG-based conversational pipeline that maintains bounded latency as dialogue histories grow to thousands of exchanges while preserving response relevance; and (3) system integration and empirical validation of these components within a modular agentic architecture that coordinates speech, vision, and physical action through structured APIs. The implementation of ARIS will be released as open source upon publication.
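One way to see why retrieval can keep response cost bounded as history grows is that only a fixed number of past exchanges reach the language model, however long the log becomes. The sketch below illustrates that idea under stated assumptions: the function `retrieve` and its helpers are hypothetical, bag-of-words cosine similarity stands in for the learned embeddings a real RAG pipeline would likely use, and the linear scan over the history would in practice be replaced by a vector index. It is not the pipeline described in the paper.

```python
import heapq
import math
import re
from collections import Counter


def _vectorize(text):
    """Bag-of-words term counts (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(history, query, k=3):
    """Return at most k past exchanges most similar to the query.

    The result size is capped at k, so the context handed to the
    language model does not grow with the length of the history.
    """
    q = _vectorize(query)
    scored = ((_cosine(q, _vectorize(turn)), i) for i, turn in enumerate(history))
    top = heapq.nlargest(k, scored)
    # Drop exchanges that share no terms with the query.
    return [history[i] for score, i in top if score > 0]


history = ["Anna has a concert next week.", "I had pasta for lunch."]
history += [f"filler turn {i}" for i in range(1000)]
print(retrieve(history, "Tell me about the concert Anna mentioned", k=3))
```

The cap `k` is the design choice that matters here: a history of two exchanges and a history of a thousand both yield a prompt of at most `k` retrieved turns.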

Problem

Research questions and friction points this paper is trying to address.

social robotics

multi-turn engagement

social-relationship reasoning

contextually grounded dialogue

Innovation

Methods, ideas, or system contributions that make the work stand out.

Social World Model

Retrieval-Augmented Generation

Agentic Architecture

Multimodal Reasoning