AI Summary
This work addresses the lack of a universal, flexible, and cluster-agnostic workload representation for distributed machine learning systems, a gap that hinders efficient design space exploration. To close it, the paper introduces Flint, a framework that uses the intermediate representation of machine learning compilers to extract workload graphs for clusters of arbitrary scale, without requiring execution on actual hardware. Flint decouples workload modeling from the underlying hardware, which makes the representation portable, and validates the extracted graphs against post-execution traces to check their fidelity. Experimental results demonstrate that Flint enables flexible and efficient design space exploration while substantially reducing evaluation overhead.
Abstract
Design space exploration for future distributed Machine Learning systems suffers from the lack of a readily available workload representation that enables flexible exploration across the stack. We present Flint, a framework that bridges this gap by leveraging the Intermediate Representation of Machine Learning framework compilers. The compiler does the heavy lifting of understanding and preserving the behavior of the original model code. Because Flint interfaces with the compiler before hardware execution, it can collect workload representations for clusters of arbitrary size. We validate the workload graph against post-execution traces and show the flexibility of Flint through a design space exploration case study.