LLM-Driven Augmented Reality Puppeteer: Controller-Free Voice-Commanded Robot Teleoperation

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the limitations of conventional robotic teleoperation, namely its reliance on physical controllers, unintuitive human-robot interaction, and limited safety guarantees. We propose a controller-free, voice-driven teleoperation paradigm that integrates augmented reality (AR) with large language models (LLMs). Users issue natural-language voice commands within an AR environment rendered on the Meta Quest 3 headset to manipulate a virtual robot avatar; the LLM interprets semantic intent in real time, generates structured action commands, and, via a safety-constrained mapping module, translates them into executable control signals for a physical robot. To our knowledge, this is the first work to realize an end-to-end, LLM-driven closed-loop control pipeline spanning speech input, AR-based virtual embodiment, and physical robot execution while ensuring operational safety. A functional prototype was implemented and validated through a preliminary user demonstration, confirming the feasibility and usability of the proposed framework.
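
To make the speech-to-command step concrete, below is a minimal sketch assuming the LLM is prompted to emit structured JSON rather than free text. The names (`ActionCommand`, `parse_intent`, `llm_call`) and the JSON schema are illustrative assumptions, not the authors' implementation.

```python
import json
from dataclasses import dataclass

@dataclass
class ActionCommand:
    """Structured action the LLM is asked to emit instead of free text."""
    action: str   # e.g. "move", "rotate", "grasp"
    target: str   # named object or frame in the AR scene
    params: dict  # numeric arguments, e.g. {"dx": 0.1, "dz": 0.0}

SYSTEM_PROMPT = (
    "You control a robot arm. Reply ONLY with JSON of the form "
    '{"action": ..., "target": ..., "params": {...}}.'
)

def parse_intent(transcript: str, llm_call) -> ActionCommand:
    """Map a voice transcript to a structured command via the LLM.

    llm_call(system, user) -> str stands in for whatever chat API is
    used; any endpoint that returns the model's text reply fits here.
    """
    raw = llm_call(SYSTEM_PROMPT, transcript)
    data = json.loads(raw)  # fail loudly on malformed output instead of guessing
    return ActionCommand(
        action=data["action"],
        target=data["target"],
        params=data.get("params", {}),
    )
```

Constraining the model to a fixed schema is what makes the downstream safety-constrained mapping tractable: every command can be validated and bounded before it ever reaches the robot.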

📝 Abstract
The integration of robotics and augmented reality (AR) presents transformative opportunities for advancing human-robot interaction (HRI) by improving usability, intuitiveness, and accessibility. This work introduces a controller-free, LLM-driven, voice-commanded AR puppeteering system that enables users to teleoperate a robot by manipulating its virtual counterpart in real time. By leveraging natural language processing (NLP) and AR technologies, our system, prototyped on the Meta Quest 3, eliminates the need for physical controllers, enhancing ease of use while minimizing the safety risks associated with direct robot operation. A preliminary user demonstration validated the system's functionality and showed its potential for safer, more intuitive, and immersive robotic control.
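
One way a safety layer of the kind the abstract alludes to can work, shown here as a hedged sketch rather than the paper's actual module: bound every commanded motion to a maximum step size and clamp the resulting target into a fixed workspace before it is sent to the physical robot. All limits and names below are illustrative assumptions.

```python
# Illustrative workspace bounds and per-command step limit (not from the paper).
WORKSPACE = {"x": (-0.4, 0.4), "y": (-0.4, 0.4), "z": (0.05, 0.6)}  # metres
MAX_STEP = 0.05  # max translation per command, metres

def clamp(value: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, value))

def safe_target(current: dict, delta: dict) -> dict:
    """Apply a bounded delta to the current pose, then clamp into the workspace."""
    target = {}
    for axis, (lo, hi) in WORKSPACE.items():
        step = clamp(delta.get("d" + axis, 0.0), -MAX_STEP, MAX_STEP)
        target[axis] = clamp(current[axis] + step, lo, hi)
    return target

# Example: a command asking for a 0.3 m jump in x is reduced to a 0.05 m step
# and the result is kept inside the workspace bounds.
print(safe_target({"x": 0.38, "y": 0.0, "z": 0.3}, {"dx": 0.3}))
# -> {'x': 0.4, 'y': 0.0, 'z': 0.3}
```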
Problem

Research questions and friction points this paper is trying to address.

Enhancing human-robot interaction with AR
Controller-free robot teleoperation via voice
Improving safety and usability in robotics
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven voice-commanded system
AR-based robot teleoperation
Controller-free human-robot interaction
Authors

Yuchong Zhang, KTH Royal Institute of Technology, Stockholm, Sweden
Bastian Orthmann, KTH Royal Institute of Technology, Stockholm, Sweden
Michael C. Welle, Postdoctoral researcher, KTH Royal Institute of Technology (Machine Learning, Robotics)
Jonne van Haastregt, KTH Royal Institute of Technology, Stockholm, Sweden
Danica Kragic, Professor of Computer Science, KTH Royal Institute of Technology (robotics, AI, robot vision, robot learning)