Matrix-Free Evaluation Strategies for Continuous and Discontinuous Galerkin Discretizations on Unstructured Tetrahedral Grids

📅 2025-09-12
🤖 AI Summary
This work addresses the node-level performance of matrix-free evaluation for continuous and discontinuous Galerkin (CG/DG) discretizations on unstructured tetrahedral meshes, targeting low to moderate polynomial degrees. The approach has three main components: (1) instead of storing a global sparse matrix, the operator integrals are evaluated cell by cell through numerical quadrature using tabulated dense local matrices of shape functions; (2) the cell-wise interpolation uses dense matrix-matrix products instead of matrix-vector products, which reaches over 60% of peak performance, and a hierarchical mesh reordering algorithm improves data locality; and (3) within a hybrid multigrid preconditioner for the Poisson operator, the implementation transitions between matrix-free and matrix-based strategies. Numerical experiments cover the Poisson and Navier–Stokes equations. At polynomial degree three, the matrix-free implementation is up to 6× faster than a global sparse-matrix approach. A strong-scaling analysis of the Poisson and Helmholtz operators indicates the method's potential for large real-world problems.

📝 Abstract
This study presents novel strategies for improving the node-level performance of matrix-free evaluation of continuous and discontinuous Galerkin spatial discretizations on unstructured tetrahedral grids. In our approach the underlying integrals of a generic finite-element operator are computed cell-by-cell through numerical quadrature using tabulated dense local matrices of shape functions, achieving high throughput for low to moderate-order polynomial degrees. By employing dense matrix-matrix products instead of matrix-vector products for the cell-wise interpolation, the method reaches over $60\%$ of peak performance. The optimization strategies exploit explicit data parallelism to enhance computational efficiency, complemented by a hierarchical mesh reordering algorithm that improves data locality. The matrix-free implementation achieves up to a $6\times$ speedup compared to a global sparse matrix-based approach at a polynomial degree of three. The effectiveness of the method is demonstrated through numerical experiments on the Poisson and Navier--Stokes equations. The Poisson operator is preconditioned by a hybrid multigrid scheme that combines auxiliary continuous finite-element spaces, polynomial and geometric coarsening where possible while employing algebraic multigrid on the coarse mesh. Within the preconditioner, the implementation transitions between the matrix-free and matrix-based strategies for optimal efficiency. Finally, we analyze the strong scaling behavior of the Poisson and Helmholtz operators, demonstrating the method's potential to solve large real-world problems.
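To make the cell-by-cell quadrature idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. As assumptions for illustration, it applies a mass operator (chosen for brevity, not one of the operators studied in the paper) with linear shape functions and a standard 4-point quadrature rule on the reference tetrahedron; the names `B`, `jacobian_det`, and `apply_mass` are hypothetical. The point it shows is that the action of the operator on a vector can be computed from a small tabulated matrix of shape-function values without ever assembling a global sparse matrix.

```python
import math

# 4-point quadrature rule on the reference tetrahedron (exact for degree 2).
QA = (5 + 3 * math.sqrt(5)) / 20
QB = (5 - math.sqrt(5)) / 20
POINTS = [(QB, QB, QB), (QA, QB, QB), (QB, QA, QB), (QB, QB, QA)]
WEIGHTS = [1.0 / 24.0] * 4  # weights sum to the reference volume 1/6

# Tabulated dense local matrix: B[q][i] = linear shape function i at point q.
# Assumption: linear (P1) elements, chosen only to keep the sketch short.
B = [[1.0 - x - y - z, x, y, z] for (x, y, z) in POINTS]


def jacobian_det(v0, v1, v2, v3):
    """Determinant of the affine map from the reference tetrahedron."""
    e1 = [v1[k] - v0[k] for k in range(3)]
    e2 = [v2[k] - v0[k] for k in range(3)]
    e3 = [v3[k] - v0[k] for k in range(3)]
    return (e1[0] * (e2[1] * e3[2] - e2[2] * e3[1])
            - e1[1] * (e2[0] * e3[2] - e2[2] * e3[0])
            + e1[2] * (e2[0] * e3[1] - e2[1] * e3[0]))


def apply_mass(vertices, cells, u):
    """Compute v = M u cell by cell, without assembling M (hypothetical helper)."""
    v = [0.0] * len(u)
    for cell in cells:
        det_j = abs(jacobian_det(*(vertices[i] for i in cell)))
        u_loc = [u[i] for i in cell]  # gather local degrees of freedom
        # Interpolate to quadrature points: u_q = B u_loc.
        u_q = [sum(B[q][i] * u_loc[i] for i in range(4)) for q in range(4)]
        # Scale by quadrature weight times Jacobian determinant.
        u_q = [WEIGHTS[q] * det_j * u_q[q] for q in range(4)]
        # Multiply by test functions: v_loc = B^T u_q.
        v_loc = [sum(B[q][i] * u_q[q] for q in range(4)) for i in range(4)]
        for i_loc, i_glob in enumerate(cell):  # scatter-add into the result
            v[i_glob] += v_loc[i_loc]
    return v


# Two tetrahedra sharing a face (made-up example mesh).
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
cells = [(0, 1, 2, 3), (1, 2, 3, 4)]
result = apply_mass(vertices, cells, [1.0] * 5)
```

Applying the operator to a vector of ones and summing the entries should recover the total mesh volume, which gives a quick sanity check for the sketch.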

Problem

Research questions and friction points this paper is trying to address.

Improving the node-level performance of matrix-free CG/DG evaluation on unstructured tetrahedral grids
Enhancing computational efficiency through explicit data parallelism and better data locality
Demonstrating the resulting performance on the Poisson and Navier-Stokes equations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Matrix-free evaluation using dense matrix-matrix products
Hierarchical mesh reordering for improved data locality
Hybrid multigrid preconditioner that transitions between matrix-free and matrix-based strategies
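The first item in the list can be illustrated with a small sketch of the data layout, again in plain Python and not taken from the paper. The assumptions are linear shape functions, a 4-point quadrature rule on the reference tetrahedron, and made-up cell data; the helper names `matvec`, `matmat`, `interpolate_per_cell`, and `interpolate_batched` are hypothetical. Because every cell shares the same tabulated matrix `B`, the local vectors of many cells can be stacked as columns and interpolated with one matrix-matrix product instead of one matrix-vector product per cell.

```python
import math

# 4-point quadrature rule on the reference tetrahedron (exact for degree 2).
QA = (5 + 3 * math.sqrt(5)) / 20
QB = (5 - math.sqrt(5)) / 20
POINTS = [(QB, QB, QB), (QA, QB, QB), (QB, QA, QB), (QB, QB, QA)]

# Tabulated once for all cells: B[q][i] = linear shape function i at point q.
B = [[1.0 - x - y - z, x, y, z] for (x, y, z) in POINTS]


def matvec(A, v):
    """Dense matrix-vector product."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def matmat(A, X):
    """Dense matrix-matrix product A @ X with X stored row-major."""
    cols = list(zip(*X))
    return [[sum(a_ik * x_k for a_ik, x_k in zip(row, col)) for col in cols]
            for row in A]


def interpolate_per_cell(B, cell_dofs):
    """One matrix-vector product per cell."""
    return [matvec(B, u_loc) for u_loc in cell_dofs]


def interpolate_batched(B, cell_dofs):
    """One matrix-matrix product for the whole batch of cells."""
    U = [list(col) for col in zip(*cell_dofs)]   # shape: n_dofs x n_cells
    UQ = matmat(B, U)                            # shape: n_q x n_cells
    return [list(col) for col in zip(*UQ)]       # back to one row per cell


# Made-up local coefficient vectors for three cells.
cell_dofs = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, -1.0], [2.0, 2.0, 2.0, 2.0]]
per_cell = interpolate_per_cell(B, cell_dofs)
batched = interpolate_batched(B, cell_dofs)
```

Both routines produce the same values. In pure Python the batched form is not faster; the sketch only shows the reorganization that lets a compiled implementation hand the work to an optimized dense matrix-matrix kernel, where each entry of `B` is reused across many cells.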