🤖 AI Summary
This study addresses the “last-mile” challenge in experimental practice, namely the difficulty of capturing and transferring localized expertise, safety nuances, and tacit procedural knowledge, by proposing a human-in-the-loop AI assistant. The system combines first-person video analysis, multimodal AI, and retrieval-augmented generation (RAG) to extract site-specific implicit knowledge from student-recorded videos of powder X-ray diffraction procedures, compile it into a traceable manual, and answer questions grounded in that manual. A two-layer safety design, consisting of RAG-based source restriction and strict system-prompt constraints, is used to suppress hallucinations so that the AI supports rather than replaces human judgment under expert supervision. In instructor evaluation, answers to in-scope queries aligned with expected guidance and out-of-scope queries were refused. Expert evaluation rated the generated advisory reports 3.25/4.00 for utility and 4.00/4.00 for safety, which suggests a reduced risk of unsupported output in this proof-of-concept setting.
📝 Abstract
Advances in Materials Informatics have accelerated the development of Self-Driving Laboratories (SDLs), yet human-led experiments remain standard in many educational and exploratory research settings. In such environments, practical know-how, including operational details and site-specific rules, is essential for safe and reliable laboratory work. In this proof-of-concept study, we developed a human-in-the-loop AI assistant that combines first-person experimental video, multimodal AI, and retrieval-augmented generation (RAG). Using student-recorded videos of powder X-ray diffraction experiments as input, the system extracts site-specific laboratory knowledge from the recorded procedures, including physical techniques and audible confirmations that conventional manuals may omit, and compiles it into a manual. It then provides responses grounded in that manual. To reduce the risk of unsupported outputs, the system employs a two-layer safety design: source restriction through RAG and strict system-prompt constraints. Instructor evaluation showed that responses aligned with expected guidance for questions covered by the manual. For out-of-scope queries, the system appropriately refused to answer, indicating a reduced risk of hallucination. Expert evaluation further indicated that the generated advisory reports were useful and safe (utility: 3.25/4.00; safety: 4.00/4.00). These results suggest a framework in which AI supports laboratory practice under explicit human supervision rather than replacing human judgment.
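The two-layer safety design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the manual excerpts, the keyword-overlap retriever, the `min_overlap` threshold, the prompt wording, and the `llm` callable are all assumptions chosen to keep the example self-contained. Layer 1 refuses whenever no manual excerpt matches the question; layer 2 constrains the model, through the system prompt, to the retrieved excerpts.

```python
import re

# Placeholder excerpts invented for illustration; the real manual is
# generated from student-recorded video and is not reproduced here.
MANUAL = [
    "Close the shutter before opening the sample chamber door.",
    "Press the powder flat in the sample holder with a glass slide.",
]

REFUSAL = "This question is not covered by the manual. Please ask the instructor."

# Layer 2 (assumed wording): restrict the model to the supplied excerpts.
SYSTEM_PROMPT = (
    "Answer only from the numbered manual excerpts and cite their numbers. "
    "If the excerpts do not contain the answer, reply exactly with: " + REFUSAL
)

STOPWORDS = {"a", "the", "is", "of", "do", "i", "how", "what", "before", "in", "with"}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, min_overlap=2):
    """Return passages sharing at least min_overlap content words with the query."""
    query_words = tokenize(query) - STOPWORDS
    scored = [(len(query_words & tokenize(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored if score >= min_overlap]


def build_request(query, passages):
    """Layer 1: return None (refuse) when no manual excerpt matches the query."""
    hits = retrieve(query, passages)
    if not hits:
        return None
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(hits, 1))
    return {"system": SYSTEM_PROMPT, "user": f"Excerpts:\n{context}\n\nQuestion: {query}"}


def answer(query, passages, llm):
    """Refuse without calling the model if retrieval is empty; otherwise call llm.

    llm is a hypothetical callable taking (system, user) and returning a string.
    """
    request = build_request(query, passages)
    if request is None:
        return REFUSAL
    return llm(request["system"], request["user"])
```

In this sketch an out-of-scope question never reaches the model, so the refusal does not depend on the model following instructions; the system prompt is a second safeguard for questions that match an excerpt only loosely. A real system would likely replace the keyword overlap with embedding-based retrieval.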