XR-DT: Extended Reality-Enhanced Digital Twin for Agentic Mobile Robots

📅 2025-12-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address safety and collaboration bottlenecks in human-robot coexistence that arise when humans cannot readily perceive a robot's reasoning and therefore do not trust it, this paper proposes XR-DT, an Extended Reality (XR)-enhanced Digital Twin framework. XR-DT combines a hierarchical digital twin architecture with a chain-of-thought prompting mechanism to integrate human intent, dynamic environmental states, and robot cognition. It brings together Unity-based simulation, a unified diffusion policy, multimodal large language models, AutoGen-based multi-agent coordination, and human feedback captured through wearable AR devices. Initial experiments show accurate human and robot trajectory prediction, and the authors present the framework as a route to interpretable, trustworthy, and adaptive human-robot interaction.

📝 Abstract

As mobile robots increasingly operate alongside humans in shared workspaces, ensuring safe, efficient, and interpretable Human-Robot Interaction (HRI) has become a pressing challenge. While substantial effort has been devoted to human behavior prediction, limited attention has been paid to how humans perceive, interpret, and trust robots' inferences, which impedes deployment in safety-critical and socially embedded environments. This paper presents XR-DT, an eXtended Reality-enhanced Digital Twin framework for agentic mobile robots that bridges physical and virtual spaces to enable bi-directional understanding between humans and robots. Our hierarchical XR-DT architecture integrates virtual-, augmented-, and mixed-reality layers, fusing real-time sensor data, simulated environments in the Unity game engine, and human feedback captured through wearable AR devices. Within this framework, we design an agentic mobile robot system with a unified diffusion policy for context-aware task adaptation. We further propose a chain-of-thought prompting mechanism that allows multimodal large language models to reason over human instructions and environmental context, while leveraging an AutoGen-based multi-agent coordination layer to enhance robustness and collaboration in dynamic tasks. Initial experimental results demonstrate accurate human and robot trajectory prediction, validating the XR-DT framework's effectiveness in HRI tasks. By embedding human intention, environmental dynamics, and robot cognition into the XR-DT framework, our system enables interpretable, trustworthy, and adaptive HRI.
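
The abstract does not show the prompt itself, so the following is only a minimal sketch of how a chain-of-thought prompt might be assembled from a human instruction and environmental context. The `SceneContext` type, the `build_cot_prompt` function, the field names, and the four reasoning steps are all hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneContext:
    """Hypothetical snapshot of the shared workspace (not from the paper)."""
    robot_pose: Tuple[float, float]
    human_positions: List[Tuple[float, float]] = field(default_factory=list)
    obstacles: List[str] = field(default_factory=list)


def build_cot_prompt(instruction: str, scene: SceneContext) -> str:
    """Combine an instruction and scene state into a step-by-step reasoning prompt."""
    humans = "; ".join(
        f"({x:.1f}, {y:.1f})" for x, y in scene.human_positions
    ) or "none"
    obstacles = ", ".join(scene.obstacles) or "none"

    # Assumed reasoning steps; the paper's actual prompt structure is not given.
    steps = [
        "Restate the human's goal in one sentence.",
        "List the humans and obstacles that could affect the route.",
        "Predict where each human is likely to move next.",
        "Choose a robot action and explain why it is safe.",
    ]
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))

    return (
        f"Instruction: {instruction}\n"
        f"Robot pose: ({scene.robot_pose[0]:.1f}, {scene.robot_pose[1]:.1f})\n"
        f"Humans at: {humans}\n"
        f"Obstacles: {obstacles}\n"
        "Reason step by step before answering:\n"
        f"{numbered}"
    )
```

In a full system the returned string would be sent to a multimodal language model together with any image input. The sketch covers only the text assembly, which keeps the reasoning steps explicit so that a human can read them.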

Problem

Research questions and friction points this paper is trying to address.

Bridging human-robot perception gaps for safe shared workspaces

Enabling bidirectional understanding between humans and agentic robots

Improving interpretability and trust in human-robot interaction systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extended Reality-enhanced Digital Twin framework for bidirectional human-robot understanding

Hierarchical XR-DT architecture integrating virtual, augmented, and mixed reality layers

Chain-of-thought prompting mechanism with multimodal large language models for reasoning

🔎 Similar Papers

On the Use of Immersive Digital Technologies for Designing and Operating UAVs

2024-07-23 · arXiv.org · Citations: 0
