🤖 AI Summary
This work addresses three problems in partial differential equation solvers on block-structured grids: complex boundary-condition handling, redundant code, and poor distributed efficiency. The authors propose a unified modeling approach that expresses user-defined boundary conditions as affine sparse linear operators and, for the first time, systematically reformulates them as sparse matrix-vector multiplication (SpMV). Using a domain-specific language (DSL) and compiler that combine multi-stage programming with polyhedral analysis, the framework automatically generates optimized matrix-free or sparse-matrix kernels and synthesizes reusable communication schedules. The reported results are 72%–88% scaling efficiency on 1,344 CPU cores, up to 7.6× speedup in boundary-computation kernels, and a reduction of over 70% in boundary-condition code.
📝 Abstract
Boundary-condition (BC) handling is a major source of complexity in PDE solvers on structured and block-structured grids, especially for high-order methods and distributed-memory execution. We present Mat2Boundary, a DSL and compiler for boundary computations that models a broad class of boundary conditions as affine sparse linear operators. This abstraction unifies halo copying, circular and symmetric mappings, zero padding, block-edge synchronization, and user-defined interpolation, while exposing a modular basic sub-matrix interface for declarative composition. To make this representation efficient, Mat2Boundary combines multi-stage programming and polyhedral analysis to generate matrix-free kernels for structured cases, support user-defined sparse matrices for irregular cases, eliminate redundant boundary work, and synthesize reusable communication schedules for distributed execution. Evaluated on two shallow-water equation solvers on cubed-sphere grids and HPCG, Mat2Boundary achieves up to 7.6$\times$ BC-kernel speedup, reduces BC code by over 70%, and scales to 1,344 CPU cores with 72%–88% efficiency.
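To make the central abstraction concrete, the sketch below writes a 1D ghost-cell fill as an affine sparse operator, y = A x + b, and applies it as an SpMV. It is only an illustration of the idea, not Mat2Boundary's DSL or generated code: the names `bc_operator`, `apply_spmv`, and `periodic_matrix_free` are hypothetical, the grid is one-dimensional, and the halo width is assumed to be no larger than the interior size. The last function shows what a matrix-free kernel for the structured periodic case might reduce to.

```python
import numpy as np

def bc_operator(n, halo, kind, value=0.0):
    """Build (rows, cols, vals, b) so that y = A @ x + b fills ghost cells.

    x holds n interior cells; y has n + 2*halo entries, with interior
    cell i stored at index halo + i. Assumes halo <= n.
    """
    rows, cols, vals = [], [], []
    b = np.zeros(n + 2 * halo)
    # Interior rows are a plain identity copy.
    for i in range(n):
        rows.append(halo + i); cols.append(i); vals.append(1.0)
    for g in range(halo):
        # g-th ghost cell counted outward from each edge.
        left, right = halo - 1 - g, halo + n + g
        if kind == "periodic":        # circular mapping
            src_left, src_right = n - 1 - g, g
        elif kind == "symmetric":     # mirror about the domain edge
            src_left, src_right = g, n - 1 - g
        elif kind == "zero":          # zero padding: empty matrix rows
            continue
        elif kind == "constant":      # purely affine part: A row empty, b set
            b[left] = b[right] = value
            continue
        else:
            raise ValueError(f"unknown boundary kind: {kind}")
        rows += [left, right]
        cols += [src_left, src_right]
        vals += [1.0, 1.0]
    return np.array(rows), np.array(cols), np.array(vals), b

def apply_spmv(rows, cols, vals, b, x):
    """Apply the affine operator stored in COO form: y = A @ x + b."""
    y = b.copy()
    np.add.at(y, rows, vals * x[cols])  # scatter-add handles repeated rows
    return y

def periodic_matrix_free(x, halo):
    """Matrix-free equivalent of the periodic operator (no A is stored)."""
    return np.concatenate([x[-halo:], x, x[:halo]])

x = np.array([1.0, 2.0, 3.0, 4.0])
r, c, v, b = bc_operator(len(x), 2, "periodic")
print(apply_spmv(r, c, v, b, x))
print(periodic_matrix_free(x, 2))
```

Both calls produce the same extended vector. The matrix form is what lets different boundary kinds be composed declaratively, while the matrix-free form is the sort of specialized kernel a compiler could emit once the sparsity pattern is known to be regular.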