Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance

📅 2025-02-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the low efficiency and poor accuracy of information retrieval and maintenance instruction generation for technicians handling heterogeneous multimodal data (e.g., text, images, 3D models) in XR environments, this paper proposes the first cross-format Retrieval-Augmented Generation (RAG) framework tailored for industrial XR. The framework achieves unified retrieval via cross-modal semantic alignment and integrates large language models (LLMs)—specifically GPT-4 and GPT-4o-mini—to generate context-aware maintenance instructions. Its key innovation lies in the first end-to-end integration of joint multimodal retrieval and LLM-based instruction generation within an XR runtime. Experimental results demonstrate a 37% improvement in instruction response accuracy and an average latency of 1.18 seconds; for complex queries, BLEU and METEOR scores reach 42.6 and 48.3, respectively—validating the framework’s superior real-time performance, accuracy, and industrial applicability.

Technology Category

Application Category

📝 Abstract
This paper presents a detailed evaluation of a Retrieval-Augmented Generation (RAG) system that integrates large language models (LLMs) to enhance information retrieval and instruction generation for maintenance personnel across diverse data formats. We assessed the performance of eight LLMs, emphasizing key metrics such as response speed and accuracy, which were quantified using BLEU and METEOR scores. Our findings reveal that advanced models like GPT-4 and GPT-4o-mini significantly outperform their counterparts, particularly when addressing complex queries requiring multi-format data integration. The results validate the system's ability to deliver timely and accurate responses, highlighting the potential of RAG frameworks to optimize maintenance operations. Future research will focus on refining retrieval techniques for these models and enhancing response generation, particularly for intricate scenarios, ultimately improving the system's practical applicability in dynamic real-world environments.
Problem

Research questions and friction points this paper is trying to address.

Enhance information retrieval for maintenance
Integrate multi-format data with LLMs
Optimize response accuracy and speed
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LLMs for enhanced retrieval
Assesses models using BLEU, METEOR metrics
Focuses on multi-format data integration