🤖 AI Summary
This work addresses the challenge of designing hash maps in functional array languages that are efficient, polymorphic, and parallel, including for dynamically sized keys. We derive a data-parallel, two-level, static, collision-free hash map by giving a functional formulation of the Fredman et al. construction and then flattening it. Dynamically sized keys are handled by associating each hash map with an arbitrary context, and the hash maps are exposed through a flexible, polymorphic, abstract interface in Futhark. In GPU benchmarks, our hash maps outperform conventional tree/search-based approaches, but the state-of-the-art cuCollections library remains significantly faster for hash map construction and, to a lesser degree, for lookups. We analyze how much of this gap stems from low-level code generation limitations in the Futhark compiler and how much from constructs missing in the data-parallel programming vocabulary, and we reflect on whether the functional array programming model could, or should, be extended to close it.
📝 Abstract
We present a systematic derivation of a data-parallel implementation of two-level, static, and collision-free hash maps, by giving a functional formulation of the Fredman et al. construction and then flattening it. We discuss the challenges of providing a flexible, polymorphic, and abstract interface to hash maps in a functional array language, with particular attention paid to the problem of dynamically sized keys, which we address by associating each hash map with an arbitrary context. The algorithm is implemented in Futhark, and the achieved GPU execution performance is compared on simple benchmark problems. We find that our hash maps outperform conventional tree/search-based approaches. Furthermore, our implementation is compared against the state-of-the-art cuCollections library, which is significantly faster for hash map construction, and to a lesser degree for lookups. We explain to what extent the performance difference is due to low-level code generation limitations in the Futhark compiler, and to what extent it can be attributed to the data-parallel programming vocabulary not providing the constructs necessary to express the equivalent of the algorithms used by cuCollections. We end by reflecting on the extent to which the functional array language programming model could, or should, be extended to address these weaknesses.
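For readers unfamiliar with the two-level scheme, the following is a minimal sequential sketch in Python of a static, collision-free table in the style of Fredman et al. It is not the authors' Futhark code and is neither data-parallel nor flattened. The names `build`, `lookup`, `make_hash`, and `apply_hash`, the prime `P`, the hash family, and the `4 * n` retry threshold are all assumptions chosen for illustration. Keys are assumed to be distinct non-negative integers below `P`.

```python
import random

# Assumption: a Mersenne prime larger than every key; not taken from the paper.
P = (1 << 61) - 1


def make_hash(rng, size):
    """Draw h(k) = ((a*k + b) mod P) mod size from a universal family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return (a, b, size)


def apply_hash(h, key):
    a, b, size = h
    return ((a * key + b) % P) % size


def build(pairs, seed=0):
    """Build a static two-level table from distinct (int key, value) pairs."""
    rng = random.Random(seed)
    n = max(1, len(pairs))

    # Level 1: redraw the top-level hash until the sum of squared bucket
    # sizes is linear in n, which bounds the total second-level space.
    while True:
        top = make_hash(rng, n)
        buckets = [[] for _ in range(n)]
        for k, v in pairs:
            buckets[apply_hash(top, k)].append((k, v))
        if sum(len(b) ** 2 for b in buckets) <= 4 * n:
            break

    # Level 2: a bucket with s keys gets s*s slots; redraw its hash
    # until no two keys in the bucket share a slot.
    subtables = []
    for bucket in buckets:
        size = len(bucket) ** 2
        if size == 0:
            subtables.append((None, []))
            continue
        while True:
            h = make_hash(rng, size)
            slots = [None] * size
            collision = False
            for k, v in bucket:
                i = apply_hash(h, k)
                if slots[i] is not None:
                    collision = True
                    break
                slots[i] = (k, v)
            if not collision:
                break
        subtables.append((h, slots))
    return (top, subtables)


def lookup(table, key, default=None):
    """Two hash evaluations and one key comparison; no probing is needed."""
    top, subtables = table
    h, slots = subtables[apply_hash(top, key)]
    if h is None:
        return default
    entry = slots[apply_hash(h, key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return default
```

A table built with `build([(k, v), ...])` is immutable afterwards, which is what "static" means here; `lookup(table, k)` returns the stored value or `default` when the key is absent. The retry loops are the part that a data-parallel formulation has to express over all buckets at once, and this sketch leaves them sequential.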