Static Retrieval Revisited: To Optimality and Beyond

📅 2025-10-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

For the static retrieval problem (mapping $n$ keys to $v$-bit values), the feasibility of achieving $O(1)$ query time using $nv + o(n)$ bits of space when $v = \Theta(\log n)$ has remained open for years. Method: We establish a tight information-theoretic lower bound, proving that $O(1)$ query time and $nv + o(n)$ space are incompatible for large $v$. We then introduce a *joint storage framework*, enabling multiple retrieval structures (e.g., $D_1 \cup D_2$) to share underlying memory coherently. Our approach integrates information-theoretic analysis, word-RAM modeling, hashing, and compression techniques. Contribution/Results: We resolve the long-standing complexity question by showing a fundamental time–space trade-off in the large-value regime. The joint framework achieves total space $nv + \mathrm{Space}(D_2) + \tilde{O}(n^{2/3})$ while preserving the optimal query time of each constituent structure, thereby breaking classical redundancy barriers and enabling significant space savings without sacrificing efficiency.

📝 Abstract

In the static retrieval problem, a data structure must answer retrieval queries mapping a set of $n$ keys in a universe $[U]$ to $v$-bit values. Information-theoretically, retrieval data structures can use as little as $nv$ bits of space. For small value sizes $v$, it is possible to achieve $O(1)$ query time while using space $nv + o(n)$ bits. Whether or not such a result is possible for larger values of $v$ (e.g., $v = \Theta(\log n)$) has remained open. In this paper, we obtain a tight lower bound (as well as matching upper bounds) for the static retrieval problem. In the case where values are large, we show that there is actually a significant tension between time and space. It is not possible, for example, to get $O(1)$ query time using $nv + o(n)$ bits of space, when $v = \Theta(\log n)$ (and assuming the word RAM model with $O(\log n)$-bit words). At first glance, our lower bound would seem to render retrieval unusable in many settings that aim to achieve very low redundancy. However, our second result offers a way around this: We show that, whenever a retrieval data structure $D_1$ is stored along with another data structure $D_2$ (whose size is similar to or larger than the size of $D_1$), it is possible to implement the combined data structure $D_1 \cup D_2$ so that queries to $D_1$ take $O(1)$ time, operations on $D_2$ take the same asymptotic time as if $D_2$ were stored on its own, and the total space is $nv + \mathrm{Space}(D_2) + n^{0.67}$ bits.
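To make the retrieval contract concrete, here is a minimal Python sketch of a well-known textbook-style construction: a table of cells in which each key hashes to three cells, and the XOR of those cells equals the key's value. This is not the construction from the paper. It stores no keys, answers queries with three cell reads, and uses roughly $1.3\,nv$ bits, so its redundancy is far above the $o(n)$ regime discussed in the abstract. All names (`build`, `query`, `redundancy_bits`), the BLAKE2 hashing scheme, and the 1.3 sizing factor are assumptions made for illustration.

```python
import hashlib


def _slots(key, seed, m):
    """Return three distinct cell indices for `key`, one in each third of the table."""
    seg = m // 3
    out = []
    for i in range(3):
        digest = hashlib.blake2b(f"{seed}:{i}:{key}".encode(), digest_size=8).digest()
        out.append(i * seg + int.from_bytes(digest, "big") % seg)
    return out


def build(pairs, v, max_seeds=100):
    """Build a retrieval table for `pairs` (key -> int value below 2**v)."""
    if any(not 0 <= val < (1 << v) for val in pairs.values()):
        raise ValueError("every value must fit in v bits")
    n = len(pairs)
    m = 3 * (int(1.3 * n) // 3 + 2)  # roughly 1.3n cells, a multiple of 3 (assumed factor)
    for seed in range(max_seeds):
        slots = {k: _slots(k, seed, m) for k in pairs}
        cells = [set() for _ in range(m)]
        for k, s in slots.items():
            for c in s:
                cells[c].add(k)
        # Peel: repeatedly remove a key that is alone in some cell.
        order = []
        pending = [c for c in range(m) if len(cells[c]) == 1]
        while pending:
            c = pending.pop()
            if len(cells[c]) != 1:
                continue
            (k,) = cells[c]
            order.append((k, c))
            for c2 in slots[k]:
                cells[c2].discard(k)
                if len(cells[c2]) == 1:
                    pending.append(c2)
        if len(order) != n:
            continue  # peeling got stuck; retry with a different seed
        # Assign cells in reverse peel order so each key's three cells XOR to its value.
        table = [0] * m
        for k, c in reversed(order):
            a, b, d = slots[k]
            # table[c] is still 0 here, so this sets it to value XOR the other two cells.
            table[c] = pairs[k] ^ table[a] ^ table[b] ^ table[d]
        return {"seed": seed, "m": m, "v": v, "table": table}
    raise RuntimeError("no seed produced a peelable layout")


def query(r, key):
    """Return the stored value for a key in the set; arbitrary output otherwise."""
    a, b, d = _slots(key, r["seed"], r["m"])
    return r["table"][a] ^ r["table"][b] ^ r["table"][d]


def redundancy_bits(r, n):
    """Bits used by the cell table beyond the information-theoretic n*v."""
    return r["m"] * r["v"] - n * r["v"]
```

A query for a key outside the original set returns an arbitrary $v$-bit value, which is exactly what the retrieval problem permits and why the structure can avoid storing keys. The `redundancy_bits` helper only counts the cell table; it shows the gap between a simple structure like this one and the $nv + o(n)$ target that the abstract's lower bound concerns.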

Problem

Research questions and friction points this paper is trying to address.

Establishing tight bounds for static retrieval with large values

Resolving tension between query time and space efficiency

Enabling efficient retrieval when combined with another data structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

Achieved a tight lower bound for the static retrieval problem

Demonstrated a time-space trade-off for large value sizes

Stored a retrieval structure jointly with a second data structure to reduce total space while preserving query times
