WOW: Workflow-Aware Data Movement and Task Scheduling for Dynamic Scientific Workflows

📅 2025-03-17
📈 Citations: 1
Influential: 0
🤖 AI Summary
In dynamic scientific workflows, task scheduling is decoupled from data movement, so tasks are often assigned to nodes that lack their input data, which causes network congestion and execution delays. To address this, the paper proposes WOW, a workflow-aware scheduling approach that steers task placement and data movement jointly: it creates speculative copies of intermediate files to prepare the execution of subsequently scheduled tasks, and it supports dynamically constructed execution plans. WOW is implemented as a prototype for Nextflow with Kubernetes as the resource manager. In experiments with 16 synthetic and real-world workflows, WOW reduced makespan in all cases, by up to 94.5% for synthetic workflow patterns and up to 53.2% for real workflows, at a moderate increase in temporary storage space. The core contribution is the co-optimization of scheduling and data movement for dynamic workflows, which breaks the traditional separation between these concerns.

📝 Abstract
Scientific workflows process extensive data sets over clusters of independent nodes, which requires a complex stack of infrastructure components, especially a resource manager (RM) for task-to-node assignment, a distributed file system (DFS) for data exchange between tasks, and a workflow engine to control task dependencies. To enable a decoupled development and installation of these components, current architectures place intermediate data files during workflow execution independently of the future workload. In data-intensive applications, this separation results in suboptimal schedules, as tasks are often assigned to nodes lacking input data, causing network traffic and bottlenecks. This paper presents WOW, a new scheduling approach for dynamic scientific workflow systems that steers both data movement and task scheduling to reduce network congestion and overall runtime. For this, WOW creates speculative copies of intermediate files to prepare the execution of subsequently scheduled tasks. WOW supports modern workflow systems that gain flexibility through the dynamic construction of execution plans. We prototypically implemented WOW for the popular workflow engine Nextflow using Kubernetes as a resource manager. In experiments with 16 synthetic and real workflows, WOW reduced makespan in all cases, with improvement of up to 94.5% for workflow patterns and up to 53.2% for real workflows, at a moderate increase of temporary storage space. It also has favorable effects on CPU allocation and scales well with increasing cluster size.

Problem

Research questions and friction points this paper is trying to address.

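Intermediate data is placed independently of the future workload, so tasks are often assigned to nodes that lack their input data.
The resulting data transfers cause network congestion and bottlenecks, which lengthens overall workflow runtime.
Scheduling decisions must work with workflow systems that construct their execution plans dynamically.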

Innovation

Methods, ideas, or system contributions that make the work stand out.

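Joint steering of task scheduling and data movement reduces network congestion and makespan.
Speculative copies of intermediate files prepare nodes for subsequently scheduled tasks.
A prototype for Nextflow, with Kubernetes as the resource manager, shows the approach in a widely used workflow stack.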
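
The idea of placing tasks where their input data already resides, and copying only what is missing, can be illustrated with a minimal sketch. This is not WOW's algorithm as described in the paper; the function names (`missing_bytes`, `place_task`), the file names, the sizes, and the greedy "least missing bytes" rule are all hypothetical and chosen only to make the data-locality argument concrete.

```python
from typing import Dict, List, Set, Tuple


def missing_bytes(inputs: List[str], node: str,
                  locations: Dict[str, Set[str]],
                  sizes: Dict[str, int]) -> int:
    """Bytes of input data that `node` would have to fetch over the network."""
    return sum(sizes[f] for f in inputs if node not in locations.get(f, set()))


def place_task(inputs: List[str], free_nodes: List[str],
               locations: Dict[str, Set[str]],
               sizes: Dict[str, int]) -> Tuple[str, List[str]]:
    """Greedy, hypothetical placement rule: pick the free node that needs the
    least remote data (ties go to the node listed first).

    Returns the chosen node and the input files that still must be copied to it.
    """
    best = min(free_nodes,
               key=lambda n: missing_bytes(inputs, n, locations, sizes))
    to_copy = [f for f in inputs if best not in locations.get(f, set())]
    return best, to_copy


# Illustrative data only (assumed, not taken from the paper).
sizes = {"a.bam": 800, "b.vcf": 50}
locations = {"a.bam": {"node1"}, "b.vcf": {"node2"}}

node, to_copy = place_task(["a.bam", "b.vcf"],
                           ["node1", "node2", "node3"],
                           locations, sizes)
print(node, to_copy)  # node1 ['b.vcf']
```

In this toy setup, `node1` already holds the large file, so only the small file has to be moved. A scheduler that ignored data location could just as well pick `node3` and transfer both files.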
Fabian Lehmann
Ph.D. candidate, Humboldt-Universität zu Berlin
adaptive scheduling of large workflows
Jonathan Bader
TU Berlin
Resource Management, Distributed Systems, Scientific Workflows
Friedrich Tschirpke
Humboldt-Universität zu Berlin, Germany
Ninon De Mecquenem
Humboldt-Universität zu Berlin, Germany
Ansgar Lößer
Technische Universität Darmstadt, Germany
Soeren Becker
TU Berlin
Distributed Systems, Edge Computing, Self-adaptive systems, AIOps
Katarzyna Ewa Lewińska
Humboldt-Universität zu Berlin, Germany
L. Thamsen
University of Glasgow, UK
Ulf Leser
Humboldt-Universität zu Berlin, Germany