Linear Hashing Is Optimal

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper resolves an open problem posed by Alon et al.: proving that random linear hashing, defined by an $n \times n$ matrix drawn uniformly from $\mathbb{F}_2^{n \times n}$, achieves expected maximum load $\Theta(\log n / \log\log n)$ when hashing $n$ balls into $n$ bins, matching the optimal bound for fully random hashing. The authors establish, for the first time, the asymptotic optimality of linear hashing under this canonical load-balancing metric. They further derive a strong tail bound: the probability that any bin's load exceeds $r \cdot \Theta(\log n / \log\log n)$ is $O(1/r^2)$. Technically, the analysis overcomes challenges arising from linear dependencies via structural properties of linear maps over finite fields, higher-order moment estimation, refined probabilistic inequalities, and precise modeling of bin-load distributions. This result settles a long-standing theoretical question on the load capacity of linear hashing, open since STOC '97.

📝 Abstract
We prove that hashing $n$ balls into $n$ bins via a random matrix over $\mathbf{F}_2$ yields expected maximum load $O(\log n / \log\log n)$. This matches the expected maximum load of a fully random function and resolves an open question posed by Alon, Dietzfelbinger, Miltersen, Petrank, and Tardos (STOC '97, JACM '99). More generally, we show that the maximum load exceeds $r \cdot \log n / \log\log n$ with probability at most $O(1/r^2)$.
Problem

Research questions and friction points this paper is trying to address.

Prove optimality of linear hashing for load balancing
Match maximum load of fully random hash functions
Resolve open question from Alon et al. (STOC '97)
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses random matrix over F_2 for hashing
Achieves O(log n / log log n) maximum load
Matches fully random function performance
Michael Jaber
Department of Computer Science, University of Texas at Austin
Vinayak M. Kumar
Department of Computer Science, University of Texas at Austin
David Zuckerman
Professor of Computer Science, University of Texas at Austin
Pseudorandomness · Randomness · Theory of Computation · Complexity Theory