🤖 AI Summary
Scientific workflows (e.g., those built with Taverna) frequently decay because of service deprecation, outdated dependencies, and system obsolescence, which severely impedes knowledge reuse. To address this, we propose a generative AI–driven workflow revival framework. First, it semantically parses decaying workflows; then, it combines service-matching algorithms with large language models to automate service replacement and cross-platform migration (e.g., to Snakemake or VisFlow). A human-in-the-loop verification mechanism and progressive visual analytics ensure reconstruction reliability. Finally, an open-source, crowdsourced platform facilitates community collaboration on repair and functional validation. Evaluated on multiple real-world Taverna workflows, our approach significantly reduces manual parsing effort while preserving expert oversight at critical decision points. To the best of our knowledge, this is the first framework to establish a closed-loop revival pipeline that runs from automated diagnosis and intelligent reconstruction to verifiable, reusable workflows.
📝 Abstract
Scientific workflows encode valuable domain expertise and computational methodologies. Yet studies consistently show that a significant proportion of published workflows decay over time. This problem is particularly acute for legacy workflow systems like Taverna, where discontinued services, obsolete dependencies, and system retirement render previously functional workflows unusable. We present a novel legacy workflow migration system called CodeR$^3$ (short for Code Repair, Revival and Reuse), which leverages generative AI to analyze the characteristics of decayed workflows and reproduce them in modern workflow technologies such as Snakemake and VisFlow. The system also integrates stepwise visualization of the workflow analysis, automated service substitution, and human-in-the-loop validation. Through several case studies of Taverna workflow revival, we demonstrate the feasibility of this approach and identify key challenges that require human oversight. Our findings show that automation significantly reduces the manual effort of workflow parsing and service identification, but that critical tasks such as service substitution and data validation still require domain expertise. The end result will be a crowdsourcing platform that enables the community to collaboratively revive decayed workflows and to validate the functionality and correctness of the revived workflows. This work contributes a framework for workflow revival that balances automation efficiency with necessary human judgment.