Building Knowledge from Interactions: An LLM-Based Architecture for Adaptive Tutoring and Social Reasoning

📅 2025-04-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address memory limitations, contextual fragmentation, and insufficient personalization of large language models (LLMs) in robot-mediated instruction, this paper proposes a cognitively inspired multimodal memory architecture for human–robot collaborative teaching agents. Methodologically, it integrates multimodal perception interfaces, a hierarchical memory system featuring selective encoding and context-aware retrieval, and a dual-track decision mechanism balancing task execution and social interaction—enabling dynamic goal-directed behavior and naturalistic social engagement. The key contribution is the first integration of embodied memory modeling with LLM-based agents, supporting cross-session knowledge accumulation and generalizable reasoning. Empirical validation via human–robot interaction (HRI) user studies and synthetic data experiments demonstrates that the system autonomously advances training workflows, sustains long-horizon dialogue coherence, and significantly improves task completion rate (+28.6%) and interaction naturalness (p < 0.01).

Technology Category

Application Category

📝 Abstract
Integrating robotics into everyday scenarios like tutoring or physical training requires robots capable of adaptive, socially engaging, and goal-oriented interactions. While Large Language Models show promise in human-like communication, their standalone use is hindered by memory constraints and contextual incoherence. This work presents a multimodal, cognitively inspired framework that enhances LLM-based autonomous decision-making in social and task-oriented Human-Robot Interaction. Specifically, we develop an LLM-based agent for a robot trainer, balancing social conversation with task guidance and goal-driven motivation. To further enhance autonomy and personalization, we introduce a memory system for selecting, storing and retrieving experiences, facilitating generalized reasoning based on knowledge built across different interactions. A preliminary HRI user study and offline experiments with a synthetic dataset validate our approach, demonstrating the system's ability to manage complex interactions, autonomously drive training tasks, and build and retrieve contextual memories, advancing socially intelligent robotics.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM-based decision-making in social human-robot interaction
Balancing social conversation with task guidance in robot trainers
Improving autonomy via memory systems for personalized interaction experiences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal framework enhances LLM decision-making
Memory system stores and retrieves interaction experiences
Balances social conversation with task guidance
🔎 Similar Papers
No similar papers found.