AI Summary
This work proposes SocialLDG, a novel framework aimed at endowing robots with social intelligence by enabling them to infer users' internal states from observed behaviors, predict future actions, and generate appropriate responses. Integrating insights from cognitive science, SocialLDG uniquely combines lexical priors derived from language models with dynamic graph neural networks to explicitly model the temporally evolving relationships among six core tasks, thereby capturing the dynamic mapping between latent internal states and observable behaviors. The framework supports continual learning of new tasks without catastrophic forgetting and offers strong interpretability and scalability. Evaluated on two public human-robot social interaction datasets, SocialLDG achieves state-of-the-art performance, revealing the underlying temporal dynamics governing the interplay between internal states and behavioral expressions in social interactions.
Abstract
For a robot to be called socially intelligent, it must be able to infer users' internal states from their current behaviour, predict the users' future behaviour, and, if required, respond appropriately. In this work, we investigate how robots can be endowed with such social intelligence by modelling the dynamic relationship between users' internal states (latent) and actions (observable). Our premise is that these states arise from the same underlying socio-cognitive process and influence each other dynamically. Drawing inspiration from theories in Cognitive Science, we propose a novel multi-task learning framework, termed SocialLDG, that explicitly models the dynamic relationship among the states, represented as six distinct tasks. Our framework uses a language model to introduce lexical priors for each task and employs dynamic graph learning to model task affinity as it evolves over time. SocialLDG has three advantages. First, it achieves state-of-the-art performance on two challenging, publicly available human-robot social interaction datasets. Second, it supports strong task scalability by learning new tasks seamlessly without catastrophic forgetting. Finally, by explicitly modelling task affinity, it offers insights into how different interactions unfold over time and how internal states and observable actions influence each other in human decision making.
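
To make the idea of a time-varying task-affinity graph more concrete, the following is a minimal illustrative sketch and not the authors' architecture. It assumes each of the six tasks is a node carrying a feature vector, that a hand-written prior vector stands in for the lexical prior a language model would supply, and that affinity is a row-wise softmax over scaled dot products. The names `task_affinity`, `propagate`, and `run` are hypothetical, as is the choice to fuse observations by simple addition.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def task_affinity(features):
    """Row-normalised affinity matrix from pairwise scaled dot products.

    Assumption: affinity between tasks i and j is a softmax over feature
    similarity. The source does not specify how affinity is computed.
    """
    d = len(features[0])
    return [
        softmax([dot(fi, fj) / math.sqrt(d) for fj in features])
        for fi in features
    ]


def propagate(features, affinity):
    """One message-passing step: each task's new feature vector is the
    affinity-weighted average of all task feature vectors."""
    n, d = len(features), len(features[0])
    return [
        [sum(affinity[i][j] * features[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]


def run(priors, observations):
    """Return one affinity matrix per time step.

    priors:       per-task vectors (stand-in for language-model lexical priors)
    observations: list over time of per-task observation vectors
    """
    state = [list(p) for p in priors]
    affinities = []
    for obs in observations:
        # Fuse the running task state with the current observation
        # (plain addition here, purely an illustrative assumption).
        state = [[s + o for s, o in zip(sv, ov)] for sv, ov in zip(state, obs)]
        a = task_affinity(state)
        state = propagate(state, a)
        affinities.append(a)
    return affinities
```

Calling `run` with six prior vectors and a sequence of per-task observations yields one 6x6 matrix per time step. Comparing those matrices across steps is the kind of inspection the abstract alludes to when it says explicit task-affinity modelling offers insight into how interactions unfold over time.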