Nf-PEAK: Process-Based Energy Attribution for Nextflow Workflows on Kubernetes Clusters

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of fine-grained, accurate task-level energy attribution for Nextflow scientific workflows on shared Kubernetes clusters. The authors propose Nf-PEAK, a containerized energy accounting method that maps pods to host processes via cgroup metadata and samples Intel RAPL energy counters together with per-process performance counters. To handle interference from concurrent workloads, it applies a non-linear energy-credit model and aggregates the results at task level. In an evaluation of three nf-core workflows, the method reaches an average Mean Absolute Percentage Error of 6.6% in isolated runs and 10.9% when an unrelated workload saturates 8 of 32 hardware threads per node, and it remains stable across 2, 3, 4, and 8 nodes. Compared to the state-of-the-art Kubernetes tool Kepler, it yields lower error on average, particularly under co-located load.
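For readers unfamiliar with the reported metric, the following is a minimal sketch of how a Mean Absolute Percentage Error could be computed over per-task energy values. The function name `mape` and the sample numbers are illustrative assumptions and are not taken from the paper, which may aggregate its errors differently.

```python
def mape(reference, estimated):
    """Return the Mean Absolute Percentage Error, in percent.

    `reference` holds ground-truth energies and `estimated` the attributed
    energies for the same tasks, in the same order.
    """
    if len(reference) != len(estimated) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    relative_errors = [abs(e - r) / abs(r) for r, e in zip(reference, estimated)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical per-task energies in joules (made-up values for illustration).
reference_j = [100.0, 250.0, 80.0]
estimated_j = [94.0, 262.5, 84.0]
print(f"MAPE: {mape(reference_j, estimated_j):.2f}%")
```

The relative errors here are 6%, 5%, and 5%, so the sketch reports a MAPE of about 5.33%.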

📝 Abstract

Scientific workflows are pipelines of interdependent tasks. They are increasingly executed on shared Kubernetes clusters via workflow engines such as Nextflow. Their energy consumption matters for both cost and sustainability. It is necessary to examine and optimize workflow tasks individually, because they can be very heterogeneous. However, estimating task-level energy on clusters is difficult: Intel RAPL counters report only node-level energy, access to counters and host process information is typically restricted, and concurrent workloads introduce resource contention and measurement noise. We present Nf-PEAK, a containerized method to attribute CPU-package and DRAM energy to individual processes and Nextflow tasks. Nf-PEAK (i) identifies workflow pods, (ii) maps pods to host processes via cgroup metadata, (iii) samples RAPL and per-process performance counters, and (iv) applies a non-linear energy-credit model before aggregating results at task level. On a Kubernetes cluster, we evaluate three nf-core workflows under controlled co-located CPU load. Nf-PEAK reaches an average Mean Absolute Percentage Error of 6.6% in isolated runs and 10.9% when an unrelated workload saturates 8 of 32 hardware threads per node, and remains stable across 2, 3, 4, and 8 nodes. Compared to the state-of-the-art Kubernetes tool Kepler, Nf-PEAK yields lower error on average, particularly under co-located load.
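Steps (ii) to (iv) of the abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, the cgroup path layout follows the common Kubernetes `kubepods` naming, and the power-law credit with exponent `alpha` is only a placeholder for the paper's non-linear energy-credit model, whose actual form is not given here.

```python
import re

# Assumption: pod UIDs appear in cgroup paths as "pod<uid>", with dashes
# (cgroupfs driver) or underscores (systemd driver) between the UID groups.
_POD_UID = re.compile(
    r"pod([0-9a-f]{8}[-_][0-9a-f]{4}[-_][0-9a-f]{4}[-_][0-9a-f]{4}[-_][0-9a-f]{12})"
)


def pod_uid_from_cgroup(cgroup_text):
    """Extract a pod UID from the contents of /proc/<pid>/cgroup, or None."""
    match = _POD_UID.search(cgroup_text)
    return match.group(1).replace("_", "-") if match else None


def rapl_delta_uj(prev_uj, cur_uj, max_range_uj):
    """Energy consumed between two RAPL readings, allowing one counter wrap."""
    if cur_uj >= prev_uj:
        return cur_uj - prev_uj
    return cur_uj + max_range_uj - prev_uj


def attribute_energy(delta_j, activity, alpha=1.0):
    """Split node energy across processes in proportion to activity ** alpha.

    `activity` maps a process ID to a non-negative activity measure taken from
    performance counters. `alpha` is a hypothetical stand-in for the paper's
    non-linear credit; alpha == 1.0 gives a plain proportional split.
    """
    credits = {pid: max(a, 0.0) ** alpha for pid, a in activity.items()}
    total = sum(credits.values())
    if total == 0:
        return {pid: 0.0 for pid in activity}
    return {pid: delta_j * c / total for pid, c in credits.items()}


# Usage with made-up values.
line = ("0::/kubepods.slice/kubepods-burstable.slice/"
        "kubepods-burstable-pod12345678_1234_1234_1234_123456789abc.slice/"
        "cri-containerd-abc.scope")
print(pod_uid_from_cgroup(line))
print(attribute_energy(10.0, {101: 1.0, 202: 3.0}))
```

On Linux, the readings passed to `rapl_delta_uj` would typically come from the powercap files `energy_uj` and `max_energy_range_uj`. The sketch ignores idle energy and processes outside the workflow, both of which a real attribution method has to account for.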

Problem

Research questions and friction points this paper is trying to address.

energy attribution

scientific workflows

Kubernetes

Nextflow

task-level energy

Innovation

Methods, ideas, or system contributions that make the work stand out.

energy attribution

Nextflow

Kubernetes