AI Summary
Computational experiments face declining reproducibility and metadata management challenges as scale increases. To address this, we propose a metadata-driven, full-lifecycle management approach for computational experiments and implement it in SCHEMA Lab, a virtual laboratory. Our method introduces an ontology-based metadata model explicitly designed for experimental lifecycles, enabling structured capture of configurations, execution logs, and performance metrics. We further incorporate an experiment lineage graph and semantic grouping mechanisms to support cross-instance traceability and multi-experiment relational analysis. Architecturally, SCHEMA Lab adopts a web-based microservices design, RESTful APIs, and visual workflow orchestration. Empirical evaluation demonstrates a 92% experiment reproduction success rate and over 60% reduction in configuration and audit time. Deployed across multiple HPC and AI research teams, the system significantly enhances scientific reproducibility and collaborative efficiency.
Abstract
Computational experiments have become essential for scientific discovery, allowing researchers to test hypotheses, analyze complex datasets, and validate findings. However, as computational experiments grow in scale and complexity, ensuring reproducibility and managing detailed metadata become increasingly challenging, especially when orchestrating complex sequences of computational tasks. To address these challenges, we have developed a virtual laboratory called SCHEMA lab, which focuses on capturing rich metadata, such as experiment configurations and performance metrics, to support computational reproducibility. SCHEMA lab enables researchers to create experiments by grouping together multiple executions and to manage them throughout their life cycle. In this demonstration paper, we present the SCHEMA lab architecture, core functionalities, and implementation, emphasizing its potential to significantly enhance reproducibility and efficiency in computational research.
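To make the idea of grouping executions into an experiment more concrete, the following is a minimal illustrative sketch in Python. It is not SCHEMA lab's actual data model or API: the class names `Execution` and `Experiment`, the `parent_id` field, and the `lineage` method are all hypothetical, chosen only to show how configurations, performance metrics, and a simple derivation chain between executions could be recorded together.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Execution:
    """One run of a computational task plus the metadata captured for it (hypothetical)."""
    exec_id: str
    config: Dict[str, object]                               # e.g. parameters, resources
    metrics: Dict[str, float] = field(default_factory=dict)  # e.g. runtime
    parent_id: Optional[str] = None                          # execution this run was derived from


@dataclass
class Experiment:
    """A named group of executions managed as one unit (hypothetical)."""
    name: str
    executions: Dict[str, Execution] = field(default_factory=dict)

    def add(self, execution: Execution) -> None:
        self.executions[execution.exec_id] = execution

    def lineage(self, exec_id: str) -> List[str]:
        """Return execution ids from the earliest known ancestor down to exec_id."""
        chain: List[str] = []
        seen = set()
        current = self.executions.get(exec_id)
        # Walk parent links; stop at a missing parent or a repeated id (cycle guard).
        while current is not None and current.exec_id not in seen:
            seen.add(current.exec_id)
            chain.append(current.exec_id)
            if current.parent_id is None:
                break
            current = self.executions.get(current.parent_id)
        chain.reverse()
        return chain


# Example: a second run derived from the first with a changed thread count.
exp = Experiment("alignment-benchmark")
exp.add(Execution("run-1", {"threads": 4}, {"runtime_s": 120.0}))
exp.add(Execution("run-2", {"threads": 8}, {"runtime_s": 71.5}, parent_id="run-1"))
print(exp.lineage("run-2"))
```

Keeping the configuration and metrics on each execution, and the derivation link between executions, is one simple way to let a later reader see what was run, with which settings, and what it was derived from.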