🤖 AI Summary
This study addresses the limitations of conventional robotic teleoperation, namely its reliance on physical controllers, unintuitive human–robot interaction, and limited safety guarantees. We propose a novel controller-free, voice-driven teleoperation paradigm that integrates augmented reality (AR) with large language models (LLMs). Users issue natural-language voice commands within an AR environment rendered on the Meta Quest 3 headset to manipulate a virtual robot avatar; the LLM interprets semantic intent in real time, generates structured action commands, and translates them, via a safety-constrained mapping module, into executable control signals for a physical robot. To our knowledge, this is the first work to realize an end-to-end, LLM-driven closed-loop control pipeline spanning speech input, AR-based virtual embodiment, and physical robot execution while ensuring operational safety. A functional prototype has been implemented and validated through a preliminary user demonstration, confirming the feasibility and usability of the proposed framework.
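To make the pipeline concrete, here is a minimal Python sketch of the two stages the summary describes: an LLM turning a voice transcript into a structured action command, and a safety-constrained mapping clamping that command before it reaches the robot. The paper does not publish its command schema or safety rules, so every name here (`interpret_transcript`, `apply_safety_constraints`, `SafetyLimits`, the JSON fields) is a hypothetical illustration, and the LLM call is replaced by a fixed stand-in so the sketch runs on its own.

```python
# Hypothetical sketch of the summarized pipeline (not the authors' code):
# voice transcript -> structured command (LLM step stubbed out) -> safety-
# constrained mapping -> control signal for the physical robot.
from dataclasses import dataclass, field
import json


@dataclass
class SafetyLimits:
    """Assumed safety envelope; the paper does not specify concrete bounds."""
    max_speed: float = 0.10  # m/s cap on commanded end-effector speed
    workspace: tuple = field(default=((-0.5, 0.5),   # x bounds, metres
                                      (-0.5, 0.5),   # y bounds
                                      (0.0, 0.8)))   # z bounds


def interpret_transcript(transcript: str) -> dict:
    """Stand-in for the LLM step: map a transcript to a structured command.

    A real system would prompt an LLM to emit JSON like the string below;
    here we return a fixed example so the sketch runs without model access.
    """
    llm_output = '{"action": "move_to", "target": [0.3, 0.9, 0.4], "speed": 0.25}'
    return json.loads(llm_output)


def apply_safety_constraints(cmd: dict, limits: SafetyLimits) -> dict:
    """Clamp the command's target into the workspace and cap its speed."""
    x, y, z = cmd["target"]
    (xlo, xhi), (ylo, yhi), (zlo, zhi) = limits.workspace
    cmd["target"] = [min(max(x, xlo), xhi),
                     min(max(y, ylo), yhi),
                     min(max(z, zlo), zhi)]
    cmd["speed"] = min(cmd["speed"], limits.max_speed)
    return cmd


if __name__ == "__main__":
    raw = interpret_transcript("move the arm a little to the right")
    safe = apply_safety_constraints(raw, SafetyLimits())
    print(safe)  # target clamped to the workspace, speed capped at 0.10 m/s
```

The key design point this illustrates is that the LLM never drives the robot directly: its output is a declarative command that must pass through a deterministic safety layer, which is what allows the system to offer operational safety guarantees despite the open-ended language front end.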
📝 Abstract
The integration of robotics and augmented reality (AR) presents transformative opportunities for advancing human–robot interaction (HRI) by improving usability, intuitiveness, and accessibility. This work introduces a controller-free, LLM-driven, voice-commanded AR puppeteering system that enables users to teleoperate a robot by manipulating its virtual counterpart in real time. By leveraging natural language processing (NLP) and AR technologies, our system, prototyped on the Meta Quest 3, eliminates the need for physical controllers, enhancing ease of use while minimizing the safety risks associated with direct robot operation. A preliminary user demonstration validated the system's functionality and showed its potential for safer, more intuitive, and immersive robotic control.