Published 'Distributed Ranges: A Model for Distributed Data Structures, Algorithms, and Views' at ICS 2024.
Published 'RDMA-Based Algorithms for Sparse Matrix Multiplication on GPUs' at ICS 2024.
Completed PhD thesis 'RDMA-Based Distributed Data Structures for Large Scale Parallel Systems' in 2022.
Published 'Scalable Irregular Parallelism with GPUs: Getting CPUs Out of the Way' at SC 2022.
Published 'Distributed-Memory Parallel Algorithms for Sparse Times Tall-Skinny-Dense Matrix Multiplication' at ICS 2021.
Published 'RDMA vs. RPC for Implementing Distributed Data Structures' at IA³ 2019.
Published 'BCL: A Cross-Platform Distributed Data Structures Library' at ICPP 2019.
Published 'UniSparse: An Intermediate Language for General Sparse Format Customization' at OOPSLA 2024.
Published 'C++ and Interoperability Across Libraries: The GraphBLAS C++ API' at GrAPL 2023.
Co-authored 'The GraphBLAS C API Specification' technical report.
Research Experience
At Intel Labs, leading the development of Distributed Ranges, a model for distributed data structures, algorithms, and views.
During the PhD, researched RDMA-based distributed data structures and developed the Berkeley Container Library (BCL), a cross-platform C++ library of distributed data structures.
Working on the GraphBLAS project, developing the upcoming C++ API for implementing graph algorithms using linear algebra.
Created reple, a 'replay-based' REPL supporting compiled languages including C, C++, MPI, UPC, UPC++, BCL, Go, Rust, and more.
Developed BUtil, offering clean C++ bindings for MPI and a declarative syntax for point-to-point communication.