🤖 AI Summary
This work addresses the limitations of current scientific instrument automation, which is hindered by closed graphical user interfaces (GUIs) and by APIs that lack generalizability, impeding the seamless integration of high-throughput characterization with intelligent analysis. The authors propose a GUI-native embodied agent system that emulates expert human interaction with instrument interfaces, unifying Type-1 (GUI control) and Type-2 (data analysis) capabilities within a skill-centric framework to enable end-to-end reusable workflows. The system supports ten instrument types—including FTIR, NMR, AFM, and TGA—and has demonstrated cross-instrument and cross-modal versatility and practicality in spectroscopic, microscopic, and crystallographic tasks, offering a scalable infrastructure for autonomous laboratories.
📝 Abstract
Scientific discovery increasingly depends on high-throughput characterization, yet automation is hindered by proprietary GUIs and the limited generalizability of existing API-based systems. We present Owl-AuraID, a software-hardware collaborative embodied agent system that adopts a GUI-native paradigm to operate instruments through the same interfaces as human experts. Its skill-centric framework integrates Type-1 (GUI operation) and Type-2 (data analysis) skills into end-to-end workflows, connecting physical sample handling with scientific interpretation. Owl-AuraID demonstrates broad coverage across ten categories of precision instruments and diverse workflows, including multimodal spectral analysis, microscopic imaging, and crystallographic analysis, supporting modalities such as FTIR, NMR, AFM, and TGA. Overall, Owl-AuraID provides a practical, extensible foundation for autonomous laboratories and illustrates a path toward evolving laboratory intelligence through reusable operational and analytical skills. The code is available at https://github.com/OpenOwlab/AuraID.