🤖 AI Summary
This work addresses the underuse of process malleability in scientific computing: although malleability improves resource utilization, the additional development effort it requires has limited its adoption. To bridge this gap, the authors propose DMRlib, a library with an MPI-like syntax and predefined communication patterns that substantially lower the barrier to developing malleable applications. The evaluation covers two job submission modes, rigid and moldable, and DMRlib incorporates an integrated elastic resource scheduling mechanism. Using resource allocation rate, completed jobs per second, and energy consumption as metrics, the experiments show that the proposed approach can improve global throughput by a factor of more than 3 compared with conventional workloads of non-malleable jobs, reconciling the benefits of malleability with practical usability.
📝 Abstract
Process malleability has been shown to have a highly positive impact on resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible additional development effort this solution imposes has constrained its adoption by the scientific programming community. In this work, we present DMRlib, a library designed to offer the global advantages of process malleability while providing a minimalist MPI-like syntax. The library includes a series of predefined communication patterns that greatly ease the development of malleable applications. In addition, we deploy several scenarios featuring different scalability patterns to demonstrate the positive impact of process malleability. Concretely, we study two job submission modes (rigid and moldable) in order to identify the best-case scenarios for malleability, using metrics such as resource allocation rate, completed jobs per second, and energy consumption. The experiments show that our elastic approach can improve global throughput by a factor of more than 3 compared with traditional workloads of non-malleable jobs.