🤖 AI Summary
In environments shared by humans and robots, the difficulty of interpreting robotic motion intent often creates interaction barriers and safety concerns. This study addresses the challenge through an online video experiment that systematically compares, for the first time, the effectiveness of implicit expressive motions against explicit signals (lights, text, and audio) used by a quadruped robot (Boston Dynamics Spot) to convey navigation intent. It further investigates how congruent or conflicting multimodal cues influence users' prediction accuracy, confidence in their judgments, and trust. Results indicate that expressive motion can match or even surpass conventional explicit communication methods in certain scenarios, offering empirical evidence and design guidance for improving the legibility of robot behavior.
📝 Abstract
Robots in shared spaces often move in ways that are difficult for people to interpret, placing the burden of adaptation on humans. High-DoF robots exhibit motion that people read as expressive, whether intentionally or not, making it important to understand how such cues are perceived. We present an online video study evaluating how different signaling modalities (expressive motion, lights, text, and audio) shape people's ability to understand the upcoming navigation actions of a quadruped robot (Boston Dynamics Spot). Across four common scenarios, we measure how each modality influences humans' (1) accuracy in predicting the robot's next navigation action, (2) confidence in that prediction, and (3) trust in the robot to act safely. The study tests how expressive motion compares to explicit channels, whether aligned multimodal cues enhance interpretability, and how conflicting cues affect user confidence and trust. We contribute initial evidence on the relative effectiveness of implicit versus explicit signaling strategies.