A shared compilation stack for distributed-memory parallelism in stencil DSLs

📅 2024-04-02
🏛️ International Conference on Architectural Support for Programming Languages and Operating Systems
📈 Citations: 4
Influential: 0
🤖 AI Summary
Compilers for high-performance computing (HPC) stencil domain-specific languages (DSLs) suffer from high development costs, poor infrastructure reuse, and low maintainability because each is designed in isolation. To address this, the paper proposes MLIR-HPC, an extensible compiler framework for HPC built on the MLIR infrastructure. It makes three main contributions: (1) a message-passing abstraction for distributed-memory systems that models communication semantics uniformly; (2) a distributed stencil intermediate representation (IR) that supports automated communication generation and optimization passes shared across DSLs; and (3) integration with three stencil-DSL compilers (Devito, PSyclone, and the Open Earth Compiler), which then share a common compilation stack. Evaluated across heterogeneous supercomputing architectures, the framework supports all three DSLs from a unified core and generates high-performance executables. The authors present this as a route to more sustainable, reusable, and evolvable HPC DSL compilers.

📝 Abstract
Domain-specific languages (DSLs) increase programmer productivity and provide high performance. Their targeted abstractions allow scientists to express problems at a high level, providing rich details that optimizing compilers can exploit to target current- and next-generation supercomputers. The convenience and performance of DSLs come with significant development and maintenance costs. The siloed design of DSL compilers and the resulting inability to benefit from shared infrastructure cause uncertainties around longevity and the adoption of DSLs at scale. By tailoring the broadly adopted MLIR compiler framework to HPC, we bring the same synergies that the machine learning community already exploits across their DSLs (e.g. TensorFlow, PyTorch) to the finite-difference stencil HPC community. We introduce new HPC-specific abstractions for message passing targeting distributed stencil computations. We demonstrate the sharing of common components across three distinct HPC stencil-DSL compilers: Devito, PSyclone, and the Open Earth Compiler, showing that our framework generates high-performance executables based upon a shared compiler ecosystem.
Problem

Research questions and friction points this paper is trying to address.

Develop shared compiler infrastructure for distributed-memory parallelism in stencil DSLs.
Reduce development and maintenance costs of DSL compilers in HPC.
Enable high-performance executables across multiple HPC stencil-DSL compilers.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tailor the MLIR compiler framework to HPC stencil computations.
Introduce HPC-specific message-passing abstractions for distributed stencil computations.
Share common components across three stencil-DSL compilers: Devito, PSyclone, and the Open Earth Compiler.
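
To make "message passing for distributed stencil computations" concrete, here is a minimal sketch of the halo-exchange pattern that such abstractions typically have to express. It is not the paper's IR or API. The function names (`decompose`, `halo_exchange`, `stencil_distributed`) are hypothetical, the "ranks" are simulated as plain Python lists rather than MPI processes, and the 3-point stencil is chosen only for illustration.

```python
def stencil_serial(u):
    """Reference: 3-point stencil u[i-1] - 2*u[i] + u[i+1]; boundary cells stay fixed."""
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = u[i - 1] - 2 * u[i] + u[i + 1]
    return out


def decompose(u, nranks):
    """Split u into contiguous chunks, each padded with one halo cell per side."""
    if nranks < 1 or nranks > len(u):
        raise ValueError("each simulated rank must own at least one cell")
    base, extra = divmod(len(u), nranks)
    chunks, start = [], 0
    for r in range(nranks):
        size = base + (1 if r < extra else 0)
        chunks.append([0.0] + u[start:start + size] + [0.0])
        start += size
    return chunks


def halo_exchange(chunks):
    """Simulated message passing: fill each halo cell from the neighbour's edge cell."""
    for r, c in enumerate(chunks):
        if r > 0:
            c[0] = chunks[r - 1][-2]   # last owned cell of the left neighbour
        if r < len(chunks) - 1:
            c[-1] = chunks[r + 1][1]   # first owned cell of the right neighbour


def stencil_distributed(u, nranks):
    """Decompose, exchange halos, apply the stencil per rank, then gather."""
    chunks = decompose(u, nranks)
    halo_exchange(chunks)
    result = []
    for r, c in enumerate(chunks):
        owned = c[1:-1]
        new = list(owned)
        for j in range(len(owned)):
            is_global_first = r == 0 and j == 0
            is_global_last = r == nranks - 1 and j == len(owned) - 1
            if is_global_first or is_global_last:
                continue  # global boundary cells are held fixed, as in the serial version
            # owned[j] lives at c[j + 1]; its neighbours are c[j] and c[j + 2]
            new[j] = c[j] - 2 * c[j + 1] + c[j + 2]
        result.extend(new)
    return result


if __name__ == "__main__":
    field = [float(i * i) for i in range(10)]
    print(stencil_distributed(field, 3) == stencil_serial(field))
```

In a real distributed-memory run the copies in `halo_exchange` would be sends and receives between processes. Generating that communication from the stencil's access pattern, rather than writing it by hand in each DSL compiler, is the kind of work a shared stack can factor out.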