🤖 AI Summary
Conventional wisdom in open-addressing hash tables posits an inherent space–time trade-off: high load factors inevitably degrade operation latency.
Method: The paper introduces a *non-blind* open-addressing hash table that employs dynamic rehashing, low-independence hash functions (requiring only O(1) pairwise-independent O(1)-wise independent hash functions), a non-blind probing strategy, and constructive probabilistic analysis.
Contribution/Results: This design achieves high-probability constant-time (O(1)) and expected O(1) worst-case time complexity for insertions, deletions, and lookups—even at full capacity (load factor 1.0, i.e., zero redundant slots). It is the first construction to break the long-standing Ω(log log ε⁻¹) lower bound on probe complexity for fully loaded tables. By simultaneously attaining asymptotic space optimality and strong time efficiency, the scheme establishes a new theoretical and practical paradigm for compact hash structures.
📝 Abstract
A hash table is said to be open-addressed (or non-obliviously open-addressed) if it stores elements (and free slots) in an array with no additional metadata. Intuitively, open-addressed hash tables must incur a space-time tradeoff: The higher the load factor at which the hash table operates, the longer insertions/deletions/queries should take. In this paper, we show that no such tradeoff exists: It is possible to construct an open-addressed hash table that supports constant-time operations even when the hash table is entirely full. In fact, it is even possible to construct a version of this data structure that: (1) is dynamically resized so that the number of slots in memory that it uses, at any given moment, is the same as the number of elements it contains; (2) supports $O(1)$-time operations, not just in expectation, but with high probability; and (3) requires external access to just $O(1)$ hash functions that are each just $O(1)$-wise independent. Our results complement a recent lower bound by Bender, Kuszmaul, and Zhou showing that oblivious open-addressed hash tables must incur $Omega(log log varepsilon^{-1})$-time operations. The hash tables in this paper are non-oblivious, which is why they are able to bypass the previous lower bound.