🤖 AI Summary
This paper resolves an open problem posed by Alon et al.: proving that random linear hashing—defined by an $n \times n$ matrix drawn uniformly from $\mathbb{F}_2^{n \times n}$—achieves expected maximum load $\Theta(\log n / \log\log n)$ when hashing $n$ balls into $n$ bins, matching the optimal bound for fully random hashing. The authors establish, for the first time, the asymptotic optimality of linear hashing under this canonical load-balancing metric. They further derive a strong tail bound: the probability that the maximum load exceeds $r \cdot \log n / \log\log n$ is at most $O(1/r^2)$. Technically, the analysis overcomes challenges arising from linear dependencies via structural properties of linear maps over finite fields, higher-order moment estimation, refined probabilistic inequalities, and precise modeling of bin-load distributions. This result settles a long-standing theoretical question on the load capacity of linear hashing, open since STOC '97.
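As an illustrative sketch of the hash family being analyzed (not the paper's proof), the snippet below simulates linear hashing over $\mathbb{F}_2$: it samples a uniformly random binary matrix, hashes $n$ distinct keys into $n$ bins via the map $x \mapsto Mx$, and reports the fullest bin alongside $\log n / \log\log n$. The choice of a $b \times u$ matrix mapping $u$-bit keys to $b$-bit bin indices, and the specific parameters, are assumptions made for the simulation.

```python
import math
import random
from collections import Counter

def sample_matrix(rows, cols, rng):
    """A uniformly random rows x cols matrix over F_2, one bitmask per row."""
    return [rng.getrandbits(cols) for _ in range(rows)]

def linear_hash(matrix, key):
    """h(x) = Mx over F_2: output bit i is the inner-product parity <row_i, x>."""
    out = 0
    for i, row in enumerate(matrix):
        out |= (bin(row & key).count("1") & 1) << i
    return out

# Hash n = 2**b distinct keys into 2**b bins and record the fullest bin.
rng = random.Random(0)
b, u = 10, 30                          # 1024 bins; keys from a 30-bit universe
n = 1 << b
keys = rng.sample(range(1 << u), n)    # n distinct balls
M = sample_matrix(b, u, rng)
loads = Counter(linear_hash(M, k) for k in keys)
print(max(loads.values()), round(math.log(n) / math.log(math.log(n)), 2))
```

Note that `linear_hash` is linear by construction: $h(x \oplus y) = h(x) \oplus h(y)$, which is exactly the structural dependence that makes the analysis harder than for a fully random function.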
📝 Abstract
We prove that hashing $n$ balls into $n$ bins via a random matrix over $\mathbf{F}_2$ yields expected maximum load $O(\log n / \log\log n)$. This matches the expected maximum load of a fully random function and resolves an open question posed by Alon, Dietzfelbinger, Miltersen, Petrank, and Tardos (STOC '97, JACM '99). More generally, we show that the maximum load exceeds $r\cdot\log n/\log\log n$ with probability at most $O(1/r^2)$.