More Bang For Your Buck(et): Fast and Space-efficient Hardware-accelerated Coarse-granular Indexing on GPUs

📅 2024-06-06
🏛️ arXiv.org
📈 Citations: 3
✨ Influential: 0
🤖 AI Summary
RTIndeX (RX), prior work that builds database indexes on GPU raytracing hardware, suffers from three limitations: significant per-key memory overhead, slow range lookups, and poor updateability. This paper proposes cgRX, a generalization of RX to coarse-granular indexing. Instead of indexing individual keys, cgRX represents buckets of keys in the 3D scene and post-filters each bucket after retrieval; locating the right bucket both correctly and efficiently requires firing rays in a specific sequence. In the evaluation, cgRX achieves a throughput relative to memory footprint that is 1.5-3× higher than comparable baselines that support range lookups, improves range-lookup performance over RX by up to 2×, and supports updates that are up to 5.5× faster than rebuilding the index from scratch.

📝 Abstract
In recent work, we have shown that NVIDIA's raytracing cores on RTX video cards can be exploited to realize hardware-accelerated lookups for GPU-resident database indexes. On a high level, the concept materializes all keys as triangles in a 3D scene and indexes them. Lookups are performed by firing rays into the scene and utilizing the index structure to detect hits in a hardware-accelerated fashion. While this approach called RTIndeX (or short RX) is indeed promising, it currently suffers from three limitations: (1) significant memory overhead per key, (2) slow range-lookups, and (3) poor updateability. In this work, we show that all three problems can be tackled by a single design change: Generalizing RX to become a coarse-granular index cgRX. Instead of indexing individual keys, cgRX indexes buckets of keys which are post-filtered after retrieval. This drastically reduces the memory overhead, leads to the generation of a smaller and more efficient index structure, and enables fast range-lookups as well as updates. We will see that representing the buckets in the 3D space such that the lookup of a key is performed both correctly and efficiently requires the careful orchestration of firing rays in a specific sequence. Our experimental evaluation shows that cgRX offers the most bang for the buck(et) by providing a throughput in relation to the memory footprint that is 1.5-3x higher than for the comparable range-lookup supporting baselines. At the same time, cgRX improves the range-lookup performance over RX by up to 2x and offers practical updateability that is up to 5.5x faster than rebuilding from scratch.
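The core idea of the abstract, indexing buckets of keys and post-filtering after retrieval, can be illustrated with a small CPU-only sketch. This is not the paper's implementation: the binary search over bucket representatives merely stands in for the hardware-accelerated ray cast, the choice of the largest key as each bucket's representative is an assumption made for this example, and the names `CoarseIndex`, `lookup`, and `range_lookup` are hypothetical.

```python
from bisect import bisect_left


class CoarseIndex:
    """CPU stand-in for a coarse-granular index.

    Only one representative per bucket is indexed; the keys inside a
    bucket are found by post-filtering (a short linear scan).
    """

    def __init__(self, keys, bucket_size):
        if bucket_size <= 0:
            raise ValueError("bucket_size must be positive")
        self.keys = sorted(keys)
        self.bucket_size = bucket_size
        # Assumption for this sketch: the representative of a bucket is
        # its largest key. The coarse index stores len(keys) / bucket_size
        # entries instead of one entry per key.
        self.reps = [
            self.keys[min(start + bucket_size, len(self.keys)) - 1]
            for start in range(0, len(self.keys), bucket_size)
        ]

    def _bucket_of(self, key):
        # Stand-in for the ray-traced lookup: find the first bucket whose
        # representative is >= key.
        return bisect_left(self.reps, key)

    def lookup(self, key):
        """Return the position of key in the sorted key array, or None."""
        bucket = self._bucket_of(key)
        if bucket == len(self.reps):
            return None  # key is larger than every indexed key
        start = bucket * self.bucket_size
        end = min(start + self.bucket_size, len(self.keys))
        # Post-filter: scan only the retrieved bucket.
        for pos in range(start, end):
            if self.keys[pos] == key:
                return pos
        return None

    def range_lookup(self, lo, hi):
        """Return all keys k with lo <= k <= hi, in sorted order."""
        bucket = self._bucket_of(lo)
        if bucket == len(self.reps):
            return []
        result = []
        # One coarse lookup locates the first bucket; the remainder is a
        # sequential scan over the sorted keys until hi is exceeded.
        for pos in range(bucket * self.bucket_size, len(self.keys)):
            k = self.keys[pos]
            if k > hi:
                break
            if k >= lo:
                result.append(k)
        return result
```

For example, `CoarseIndex([2, 3, 5, 7, 11, 13, 17, 19, 23], bucket_size=4)` stores three representatives for nine keys. A larger bucket size shrinks the indexed structure further but lengthens the post-filtering scan, which is the space/throughput trade-off the abstract refers to.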
Problem

Research questions and friction points this paper is trying to address.

Reduce memory overhead per key in GPU-resident database indexes
Improve slow range-lookup performance in hardware-accelerated indexing
Enhance updateability of the index structure for dynamic data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hardware-accelerated indexing using GPU raytracing cores
Coarse-granular indexing with key buckets for efficiency
Optimized ray-firing sequence for correct and fast lookups
Authors

Justus Henneberg
Johannes Gutenberg University Mainz, Germany
F. Schuhknecht
Johannes Gutenberg University Mainz, Germany
Rosina F. Kharal
University of Waterloo, Canada
Trevor Brown
University of Waterloo
Concurrency, shared memory, non-blocking data structures, transactional memory