Anonymized Network Sensing using C++26 std::execution on GPUs

📅 2025-10-15
🤖 AI Summary
To address the difficulty of GPU memory management and workload portability in large-scale network traffic analysis, this paper presents a multi-GPU implementation of the Anonymized Network Sensing Graph Challenge built on the C++26 `std::execution` asynchronous programming model. It applies Sender/Receiver semantics to the network sensing graph computation, treating GPUs as first-class execution resources and composing asynchronous work across devices. Task chains are expressed through the standardized Senders interface, with the aim of balancing developer productivity against performance. Evaluated on a system with 8 NVIDIA A100 GPUs, the implementation achieves up to 55× speedup over the reference serial GraphBLAS baseline. The authors present this as evidence that a generic, standardized programming model need not sacrifice critical-path performance on dense-GPU systems.

📝 Abstract
Large-scale network sensing plays a vital role in network traffic analysis and characterization. As network packet data grows increasingly large, parallel methods have become mainstream for network analytics. While effective, GPU-based implementations still face start-up challenges in host-device memory management and in porting complex workloads to devices, among others. To mitigate these challenges, composable frameworks have emerged that use the modern C++ programming language to deploy analytics tasks efficiently on GPUs. Specifically, the recent C++26 Senders model of asynchronous data operation chaining provides a simple interface for pushing bulk tasks to varied device execution contexts. Considering the prominence of contemporary dense-GPU platforms and vendor-leveraged software libraries, such a programming model treats GPUs as first-class execution resources (in contrast to traditional host-centric programming models), allowing convenient development of multi-GPU application workloads via expressive and standardized asynchronous semantics. In this paper, we discuss practical aspects of developing the Anonymized Network Sensing Graph Challenge on dense-GPU systems using the recently proposed C++26 Senders model. Adopting a generic and productive programming model does not necessarily impact critical-path performance (as compared to low-level proprietary vendor-based programming models): our commodity library-based implementation achieves up to 55× performance improvement on 8 NVIDIA A100 GPUs compared to the reference serial GraphBLAS baseline.
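As a rough illustration of the kind of analysis the abstract refers to, the sketch below builds a sparse traffic matrix from a window of anonymized packets and derives a few summary quantities from it. It is a serial, host-side toy: the `packet` and `traffic_stats` types, the `summarize` function, and the particular set of statistics are assumptions made for this example, not the paper's GPU implementation or the GraphBLAS reference code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical packet record: anonymized source and destination ids.
struct packet {
    std::uint32_t src;
    std::uint32_t dst;
};

// Hypothetical summary of one packet window (names chosen for this sketch).
struct traffic_stats {
    std::uint64_t packets = 0;            // total packets in the window
    std::size_t unique_links = 0;         // distinct (src, dst) pairs
    std::size_t unique_sources = 0;       // distinct source ids
    std::size_t unique_destinations = 0;  // distinct destination ids
    std::uint64_t max_link_packets = 0;   // packets on the busiest link
    std::size_t max_source_fan_out = 0;   // most distinct destinations for one source
};

traffic_stats summarize(const std::vector<packet>& window) {
    // Sparse traffic matrix: entry (src, dst) holds the packet count of that link.
    std::map<std::pair<std::uint32_t, std::uint32_t>, std::uint64_t> matrix;
    for (const packet& p : window) {
        ++matrix[{p.src, p.dst}];
    }

    traffic_stats s;
    s.packets = window.size();
    s.unique_links = matrix.size();

    std::map<std::uint32_t, std::size_t> fan_out;  // source -> distinct destinations
    std::set<std::uint32_t> destinations;
    for (const auto& [link, count] : matrix) {
        ++fan_out[link.first];
        destinations.insert(link.second);
        s.max_link_packets = std::max(s.max_link_packets, count);
    }

    s.unique_sources = fan_out.size();
    s.unique_destinations = destinations.size();
    for (const auto& [src, n] : fan_out) {
        s.max_source_fan_out = std::max(s.max_source_fan_out, n);
    }
    return s;
}
```

Each statistic is a reduction over the rows, columns, or nonzero entries of the matrix, which is the kind of data-parallel work that maps naturally onto bulk GPU execution.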
Problem

Research questions and friction points this paper is trying to address.

Addresses GPU memory management challenges in network sensing
Enables portable deployment of analytics on multi-GPU systems
Achieves high performance using C++26 asynchronous execution model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses C++26 Senders model for asynchronous operations
Leverages GPUs as first-class execution resources
Achieves performance via composable library-based implementation
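The chaining style behind the Senders model can be sketched without any GPU or library dependency. The block below is a toy stand-in written in plain C++20: the `toy` namespace and its `just`, `bulk`, `then`, and `sync_wait` only mimic the shape of the corresponding `std::execution` algorithms, run sequentially on the host, and are not the standard API or the paper's code. The `state` type and `total_packets` function are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

namespace toy {

// A toy "sender" is a deferred computation; nothing runs until sync_wait.
template <class F>
struct sender {
    F run;
};
template <class F>
sender(F) -> sender<F>;

// Start a chain from a value.
template <class T>
auto just(T value) {
    return sender{[value = std::move(value)] { return value; }};
}

// Adaptor objects that carry the user's callable until piped onto a sender.
template <class Fn>
struct then_t {
    Fn fn;
};
template <class Fn>
struct bulk_t {
    std::size_t n;
    Fn fn;
};

template <class Fn>
then_t<Fn> then(Fn fn) {
    return {std::move(fn)};
}
template <class Fn>
bulk_t<Fn> bulk(std::size_t n, Fn fn) {
    return {n, std::move(fn)};
}

// then: transform the value produced by the previous stage.
template <class F, class Fn>
auto operator|(sender<F> s, then_t<Fn> a) {
    return sender{[s = std::move(s), a = std::move(a)] { return a.fn(s.run()); }};
}

// bulk: call fn(i, value) for i in [0, n), then pass the value on.
// A real execution context could run these calls in parallel; here they are sequential.
template <class F, class Fn>
auto operator|(sender<F> s, bulk_t<Fn> a) {
    return sender{[s = std::move(s), a = std::move(a)] {
        auto value = s.run();
        for (std::size_t i = 0; i < a.n; ++i) a.fn(i, value);
        return value;
    }};
}

// Run the whole chain and return its result directly
// (unlike the standard sync_wait, which wraps the result in an optional tuple).
template <class F>
auto sync_wait(sender<F> s) {
    return s.run();
}

}  // namespace toy

// Hypothetical working state: packet counts per link plus one partial sum per chunk.
struct state {
    std::vector<std::uint32_t> link_packets;
    std::vector<std::uint64_t> partial;
};

// Sum all packet counts by reducing each chunk in a bulk stage, then combining.
std::uint64_t total_packets(std::vector<std::uint32_t> link_packets, std::size_t chunks) {
    state st{std::move(link_packets), std::vector<std::uint64_t>(chunks, 0)};
    auto work =
        toy::just(std::move(st))
        | toy::bulk(chunks, [chunks](std::size_t c, state& s) {
              assert(c < s.partial.size());
              const std::size_t n = s.link_packets.size();
              const std::size_t begin = c * n / chunks;
              const std::size_t end = (c + 1) * n / chunks;
              for (std::size_t i = begin; i < end; ++i) s.partial[c] += s.link_packets[i];
          })
        | toy::then([](const state& s) {
              return std::accumulate(s.partial.begin(), s.partial.end(), std::uint64_t{0});
          });
    return toy::sync_wait(std::move(work));
}
```

The point of the sketch is the separation it shows: the chain describes what to compute, and only the final wait decides when it runs. In the real model, where the work runs is chosen by attaching a scheduler for a particular execution context, which is what lets the same chain target a GPU.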
Michael Mandulak
Rensselaer Polytechnic Institute, Troy, NY, USA
Sayan Ghosh
Pacific Northwest National Laboratory, Richland, WA, USA
S M Ferdous
Data Scientist, PNNL
Combinatorial Scientific Computing, High Performance Computing, Algorithm Engineering
Mahantesh Halappanavar
Chief Data Scientist & Group Leader, Pacific Northwest National Laboratory
Graph algorithms, parallel computing, artificial intelligence, machine learning, combinatorial optmz
George Slota
Rensselaer Polytechnic Institute, Troy, NY, USA