🤖 AI Summary
This study addresses the limitations of relying on subjective questionnaires or indirect system metrics to evaluate user satisfaction with Socially Interactive Agents (SIAs). We propose an annotation-free approach, grounded in a real-world scenario, that automatically classifies satisfaction levels from raw social signals without manual labeling. Specifically, we synchronously capture three modalities (body pose, facial expressions, and interpersonal physical distance) during natural human-SIA interactions, then apply multi-scale temporal feature engineering and time-series machine learning to perform end-to-end binary satisfaction classification. Evaluated on 46 single-user real-world interaction sessions, our model reliably detects low-satisfaction episodes (F1 > 0.82), significantly outperforming baseline methods. Our key contributions are: (i) the first annotation-free temporal framework for SIA satisfaction recognition in real-world scenarios; and (ii) a deployable, real-time automated feedback mechanism for SIAs, advancing human-agent evaluation from static, retrospective assessment toward dynamic, autonomous adaptation.
📝 Abstract
Socially interactive agents (SIAs) are being used in various scenarios and are nearing productive deployment. Evaluating user satisfaction with SIAs' performance is a key factor in designing the interaction between the user and SIA. Currently, subjective user satisfaction is primarily assessed manually through questionnaires or indirectly via system metrics. This study examines the automatic classification of user satisfaction through analysis of social signals, aiming to enhance both manual and autonomous evaluation methods for SIAs. During a field trial at the Deutsches Museum Bonn, a Furhat Robotics head was employed as a service and information hub, collecting an "in-the-wild" dataset. This dataset comprises 46 single-user interactions, including questionnaire responses and video data. Our method focuses on automatically classifying user satisfaction based on time series classification. We use time series of social signal metrics derived from the body pose, time series of facial expressions, and physical distance. This study compares three feature engineering approaches on different machine learning models. The results confirm the method's effectiveness in reliably identifying interactions with low user satisfaction without the need for manually annotated datasets. This approach offers significant potential for enhancing SIA performance and user experience through automated feedback mechanisms.
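To make the general idea of feature-based time series classification concrete, the following minimal Python sketch turns each interaction's signal channels into a fixed-length feature vector and classifies it with a nearest-centroid rule. It is an illustration only, not the authors' pipeline: the abstract does not specify the three feature engineering approaches or the models compared, so the summary statistics, the nearest-centroid classifier, the function names, the channel names (`distance_m`, `smile`), and the toy values are all assumptions introduced here.

```python
from statistics import fmean, pstdev


def summarize(series):
    """Return [mean, std, min, max, slope] for one 1-D signal (hypothetical feature set)."""
    n = len(series)
    mu = fmean(series)
    # Least-squares slope against the sample index captures the overall trend.
    t_mu = (n - 1) / 2
    denom = sum((t - t_mu) ** 2 for t in range(n))
    slope = (
        sum((t - t_mu) * (x - mu) for t, x in enumerate(series)) / denom
        if denom
        else 0.0
    )
    return [mu, pstdev(series), min(series), max(series), slope]


def interaction_features(signals):
    """Concatenate per-channel summaries into one fixed-length vector.

    `signals` maps a channel name to a list of floats; channels are visited
    in sorted order so every interaction yields the same feature layout.
    """
    feats = []
    for name in sorted(signals):
        feats.extend(summarize(signals[name]))
    return feats


def fit_centroids(X, y):
    """Compute the mean feature vector of each class (nearest-centroid model)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


# Toy, made-up sessions; in the study, labels would come from questionnaire responses.
sessions = [
    {"distance_m": [1.0, 1.1, 1.3, 1.6], "smile": [0.1, 0.1, 0.0, 0.0]},
    {"distance_m": [1.2, 1.0, 0.9, 0.8], "smile": [0.2, 0.5, 0.6, 0.7]},
]
labels = ["low", "high"]
X = [interaction_features(s) for s in sessions]
model = fit_centroids(X, labels)
```

Because every interaction is reduced to a vector of the same length, sessions of different durations can be compared directly, which is one common reason to prefer summary features over raw sequences when the dataset is small.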