🤖 AI Summary
This work addresses the limited reliability of embodied intelligence in scenarios involving perceptual noise, ambiguous instructions, and complex human–robot interaction by proposing a reliability framework centered on dynamic, subjective human interaction, moving away from traditional formal verification approaches. The core of this framework is the construction and continuous updating of an accessible "explicit world model" that serves as a shared cognitive basis between humans and AI. Integrated with multimodal perception and intention-alignment mechanisms, this model keeps the agent's behaviour consistent with human goals and expectations. The authors argue that this approach enhances the reliability, explainability, and predictability of robotic collaborative behaviour in social, multimodal, and dynamically evolving environments.
📝 Abstract
This paper addresses robustness under sensing noise, ambiguous instructions, and human-robot interaction. We take a radically different tack on the issue of reliable embodied AI: instead of focusing on formal verification methods aimed at achieving model predictability and robustness, we emphasise the dynamic, ambiguous, and subjective nature of human-robot interaction, which requires embodied AI systems to perceive, interpret, and respond to human intentions in a manner that is consistent, comprehensible, and aligned with human expectations. We argue that when embodied agents operate in human environments that are inherently social, multimodal, and fluid, reliability is contextually determined and only has meaning in relation to the goals and expectations of the humans involved in the interaction. This calls for a fundamentally different approach to achieving reliable embodied AI, centred on building and updating an accessible "explicit world model" representing the common ground between human and AI, which is used to align robot behaviours with human expectations.