Bandicoot: A Templated C++ Library for GPU Linear Algebra

2025-08-15
AI Summary
Code written against CPU-based linear algebra libraries such as Armadillo is difficult to port efficiently to GPUs. To address this, the authors propose Bandicoot, a GPU-accelerated C++ library designed as the counterpart to Armadillo so that existing code can be moved over without significant changes. Its core innovations include: (1) a compile-time expression template system that uses C++ template metaprogramming to enable delayed evaluation and automatic optimization of mathematical expressions; and (2) a unified abstraction layer that bridges CPU and GPU execution, minimizing required code modifications. Experimental evaluation on representative linear algebra workloads demonstrates speedups of several-fold to over an order of magnitude over CPU-only Armadillo. Bandicoot is open-sourced under the Apache 2.0 license, facilitating integration into existing Armadillo-based scientific computing pipelines.

πŸ“ Abstract
We introduce the Bandicoot C++ library for linear algebra and scientific computing on GPUs, overviewing its user interface and performance characteristics, as well as the technical details of its internal design. Bandicoot is the GPU-enabled counterpart to the well-known Armadillo C++ linear algebra library, aiming to allow users to take advantage of GPU-accelerated computation for their existing codebases without significant changes. Exploiting similar internal template meta-programming techniques that Armadillo uses, Bandicoot is able to provide compile-time optimisation of mathematical expressions within user code, leading to more efficient execution. Empirical evaluations show that Bandicoot can provide significant speedups over Armadillo-based CPU-only computation. Bandicoot is available at https://coot.sourceforge.io and is distributed as open-source software under the permissive Apache 2.0 license.
Problem

Research questions and friction points this paper is trying to address.

How to bring GPU acceleration to existing Armadillo-based codebases without significant code changes
How to optimize mathematical expressions in user code at compile time for GPU execution
How much speedup GPU execution provides over Armadillo-based CPU-only computation
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated C++ linear algebra library
Template meta-programming for compile-time optimization
Seamless integration with existing Armadillo codebases
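
To make the compile-time optimization point concrete, here is a minimal CPU-only sketch of the expression template idea: operators build lightweight expression objects, and the actual arithmetic runs once, in a single loop, when the result is assigned. Everything in it (the `toy` namespace, `Vec`, `AddExpr`, `ScaleExpr`, `axpy_demo`) is hypothetical and invented for illustration. It is not Bandicoot's or Armadillo's actual code or API, and it does not touch a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types for illustration only; not Bandicoot or Armadillo internals.
namespace toy {

// Represents "lhs + rhs". Holds references and computes nothing until indexed.
template <typename L, typename R>
struct AddExpr {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Represents "k * operand" for a scalar k.
template <typename E>
struct ScaleExpr {
    double k;
    const E& operand;
    double operator[](std::size_t i) const { return k * operand[i]; }
    std::size_t size() const { return operand.size(); }
};

struct Vec {
    std::vector<double> data;

    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning from an expression evaluates the whole expression tree
    // element by element in one loop, without temporary vectors.
    template <typename E>
    Vec& operator=(const E& expr) {
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) {
            data[i] = expr[i];
        }
        return *this;
    }
};

// The operators only build expression nodes; the result type encodes the
// structure of the expression, so the compiler sees it at compile time.
template <typename L, typename R>
AddExpr<L, R> operator+(const L& lhs, const R& rhs) {
    return AddExpr<L, R>{lhs, rhs};
}

template <typename E>
ScaleExpr<E> operator*(double k, const E& operand) {
    return ScaleExpr<E>{k, operand};
}

// Computes z = 2*x + y in a single pass.
// The expression is consumed within the same statement that builds it;
// storing it in an `auto` variable would leave dangling references.
inline Vec axpy_demo(const Vec& x, const Vec& y) {
    Vec z(x.size());
    z = 2.0 * x + y;
    return z;
}

}  // namespace toy
```

Calling `toy::axpy_demo(x, y)` on two vectors of equal length returns their scaled sum. In this sketch the type of `2.0 * x + y` is `AddExpr<ScaleExpr<Vec>, Vec>`, so the shape of the expression is known to the compiler before anything is computed. A library built on this idea can use that type information to pick a fused or specialized evaluation strategy instead of evaluating each operator separately.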