A Multimodal Manufacturing Safety Chatbot: Knowledge Base Design, Benchmark Development, and Evaluation of Multiple RAG Approaches

📅 2025-11-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To meet the need for accurate, low-latency, and low-cost safety training in Industry 5.0’s human-centric manufacturing, this paper proposes a multimodal safety training chatbot tailored for manufacturing. Methodologically, it integrates large language models (LLMs) with retrieval-augmented generation (RAG), supports text and image inputs, and employs domain-specific vector retrieval, with 24 RAG configurations compared in a full-factorial experimental design. Key contributions include: (1) an expert-validated benchmark dataset for manufacturing safety training; (2) an open-source, domain-grounded safety training chatbot; and (3) a systematic methodology for designing and evaluating such systems. The top configuration achieves 86.66% accuracy, an average response latency of 10.04 seconds, and an average per-query cost of $0.005. The top two configurations were additionally evaluated by ten industry experts and academic researchers.

📝 Abstract

Ensuring worker safety remains a critical challenge in modern manufacturing environments. Industry 5.0 reorients the prevailing manufacturing paradigm toward more human-centric operations. Using a design science research methodology, we identify three essential requirements for next-generation safety training systems: high accuracy, low latency, and low cost. We introduce a multimodal chatbot powered by large language models that meets these design requirements. The chatbot uses retrieval-augmented generation to ground its responses in curated regulatory and technical documentation. To evaluate our solution, we developed a domain-specific benchmark of expert-validated question-and-answer pairs for three representative machines: a Bridgeport manual mill, a Haas TL-1 CNC lathe, and a Universal Robots UR5e collaborative robot. We tested 24 RAG configurations using a full-factorial design and assessed them with automated evaluations of correctness, latency, and cost. Our top two configurations were then evaluated by ten industry experts and academic researchers. Our results show that retrieval strategy and model configuration have a significant impact on performance. The top configuration (selected for chatbot deployment) achieved an accuracy of 86.66%, an average latency of 10.04 seconds, and an average cost of $0.005 per query. Overall, our work provides three contributions: an open-source, domain-grounded safety training chatbot; a validated benchmark for evaluating AI-assisted safety instruction; and a systematic methodology for designing and assessing AI-enabled instructional and immersive safety training systems for Industry 5.0 environments.
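
As a rough illustration of what a full-factorial design over RAG configurations involves, the sketch below enumerates every combination of a few factor levels. The abstract reports 24 configurations but does not list the factors, so the factor names and levels here are hypothetical, chosen only so that 4 × 3 × 2 = 24.

```python
from itertools import product

# Hypothetical factors and levels (NOT taken from the paper). They are
# chosen only so that 4 * 3 * 2 matches the 24 configurations reported.
FACTORS = {
    "retrieval_strategy": ["none", "dense", "sparse", "hybrid"],
    "llm": ["model_a", "model_b", "model_c"],
    "top_k": [3, 5],
}


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


configs = full_factorial(FACTORS)
print(len(configs))  # 24
```

Each resulting dictionary would then be run against the benchmark and scored on correctness, latency, and cost. A full-factorial grid evaluates every combination, which makes it possible to separate the effect of each factor from its interactions with the others.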

Problem

Research questions and friction points this paper is trying to address.

Developing a multimodal chatbot for manufacturing worker safety training

Creating accurate, low-latency, low-cost safety training systems for Industry 5.0

Evaluating RAG approaches for domain-specific safety instruction benchmarks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal chatbot using large language models

Retrieval-augmented generation with curated documentation

Systematic evaluation of 24 RAG configurations
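
The idea of grounding answers in curated documentation can be sketched as a retrieve-then-prompt step. This is a minimal illustration under stated assumptions, not the paper's pipeline: the document snippets are invented, the function names are hypothetical, and a simple bag-of-words cosine similarity stands in for the vector retrieval a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for curated documentation chunks.
DOCS = [
    "Wear safety glasses when operating the manual mill.",
    "Press the emergency stop button to halt the CNC lathe.",
    "Keep clear of the collaborative robot workspace during motion.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Assemble a prompt that restricts the model to retrieved context."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


print(build_prompt("How do I stop the lathe?", DOCS))
```

The prompt produced here would be passed to a language model; swapping the similarity function for an embedding-based one changes the retrieval strategy without touching the rest of the flow.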
