THAPI: Tracing Heterogeneous APIs

📅 2025-03-22
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Programming heterogeneous exascale HPC systems is hindered by the proliferation of complex, incompatible programming models (e.g., CUDA, SYCL, OpenMP) and the lack of traceability across CPU/GPU execution contexts.
Method: We propose the first semantic-aware, full-stack API tracing framework built upon LTTng kernel tracing, integrating user-space dynamic instrumentation with multi-model API signature parsing to capture fine-grained, low-overhead, configurable API call chains across hardware and programming abstractions.
Contribution/Results: Unlike conventional tracers that log only function names and timestamps, our framework enables cross-vendor, cross-abstraction behavioral correlation and end-to-end call-chain reconstruction in real HPC applications. It accurately identifies cross-model performance bottlenecks and implementation flaws, improving debugging efficiency by over 3× and significantly enhancing the portability and debuggability of heterogeneous programming models.

📝 Abstract
As we reach exascale, production High Performance Computing (HPC) systems are increasing in complexity. These systems now comprise multiple heterogeneous computing components (CPUs and GPUs) utilized through diverse, often vendor-specific programming models. As application developers and programming model experts build higher-level, portable programming models for these systems, debugging and performance optimization require understanding how multiple programming models stacked on top of each other interact with one another. This paper discusses THAPI (Tracing Heterogeneous APIs), a portable, programming-model-centric tracing framework: by capturing comprehensive API call details across layers of the HPC software stack, THAPI enables fine-grained understanding and analysis of how applications interact with programming models and heterogeneous hardware. By leveraging a state-of-the-art tracing framework, the Linux Trace Toolkit Next Generation (LTTng), and tracing much more than other tracing toolkits, which focus on function names and timestamps, this approach enables us to diagnose performance bottlenecks across the software stack, optimize application behavior, and debug programming model implementation issues.
Problem

Research questions and friction points this paper is trying to address.

Understanding interactions between multiple programming models in HPC systems
Debugging and optimizing performance in heterogeneous computing environments
Capturing detailed API calls across HPC software stack layers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Portable tracing framework for HPC
Captures multi-layer API call details
Uses LTTng for comprehensive performance analysis
Solomon Bekele
Argonne National Laboratory
Aurelio Vivas
Universidad de los Andes, Colombia
T. Applencourt
Argonne National Laboratory
S. Muralidharan
Argonne National Laboratory
Bryce Allen
Argonne National Laboratory
Kazutomo Yoshii
Argonne National Laboratory
Swann Perarnau
Argonne National Laboratory
Brice Videau
Argonne National Laboratory