On the Power of Learning-Augmented Search Trees

📅 2022-11-16

📈 Citations: 1

✨ Influential: 1

🤖 AI Summary

Existing learning-augmented binary search trees (BSTs) are restricted to Zipfian access distributions and lack robustness to prediction errors, dynamic updates, and general access patterns. To address these limitations, we propose Pred-Treap, a prediction-augmented Treap whose node priorities are set as $-\lfloor \log\log(1/w_x) \rfloor + U(0,1)$, where $w_x$ is the predicted access weight of key $x$ and $U(0,1)$ is a uniform random variable. This composite priority ensures that node depth is governed primarily by the predicted weights, while the random term preserves the balance guarantees of an ordinary Treap. Pred-Treap is the first learned BST variant provably optimal under arbitrary access distributions, simultaneously guaranteeing static optimality, the working-set property, online dynamic updates, and robustness to prediction inaccuracies. It naturally generalizes to B-Treaps for external memory. We provide theoretical bounds proving its static optimality and working-set guarantee. Empirical evaluation demonstrates that Pred-Treap consistently outperforms classical BSTs and the ICML '22 baseline across diverse synthetic and real-world access distributions.

📝 Abstract

We study learning-augmented binary search trees (BSTs) via Treaps with carefully designed priorities. The result is a simple search tree in which the depth of each item $x$ is determined by its predicted weight $w_x$. Specifically, each item $x$ is assigned a composite priority of $-\lfloor \log\log(1/w_x) \rfloor + U(0, 1)$ where $U(0, 1)$ is the uniform random variable. By choosing $w_x$ as the relative frequency of $x$, the resulting search trees achieve static optimality. This approach generalizes the recent learning-augmented BSTs [Lin-Luo-Woodruff ICML '22], which only work for Zipfian distributions, by extending them to arbitrary input distributions. Furthermore, we demonstrate that our method can be generalized to a B-Tree data structure using the B-Treap approach [Golovin ICALP '09]. Our search trees are also capable of leveraging localities in the access sequence through online self-reorganization, thereby achieving the working-set property. Additionally, they are robust to prediction errors and support dynamic operations, such as insertions, deletions, and prediction updates. We complement our analysis with an empirical study, demonstrating that our method outperforms prior work and classic data structures.
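
The composite priority described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes base-2 logarithms, weights strictly between 0 and 1, and a max-heap convention in which a larger priority sits closer to the root. All names (`composite_priority`, `Node`, `insert`, `depth`, `build_from_accesses`) are hypothetical and chosen only for illustration. Weights are taken to be relative frequencies, as the abstract suggests for static optimality.

```python
import math
import random
from collections import Counter


def composite_priority(weight, rng):
    """Return -floor(log2(log2(1/w))) + U(0,1).

    Assumptions (not from the source): base-2 logs and 0 < weight < 1,
    so that log2(1/weight) is positive and its logarithm is defined.
    """
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    tier = math.floor(math.log2(math.log2(1.0 / weight)))
    return -tier + rng.random()


class Node:
    def __init__(self, key, priority):
        self.key = key
        self.priority = priority
        self.left = None
        self.right = None


def rotate_right(node):
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    return pivot


def rotate_left(node):
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


def insert(root, key, priority):
    """Standard Treap insertion: BST order on keys, max-heap on priorities."""
    if root is None:
        return Node(key, priority)
    if key < root.key:
        root.left = insert(root.left, key, priority)
        if root.left.priority > root.priority:
            root = rotate_right(root)
    elif key > root.key:
        root.right = insert(root.right, key, priority)
        if root.right.priority > root.priority:
            root = rotate_left(root)
    return root  # duplicate keys are ignored


def depth(root, key):
    """Number of edges from the root to key, or None if key is absent."""
    d, node = 0, root
    while node is not None:
        if key == node.key:
            return d
        node = node.left if key < node.key else node.right
        d += 1
    return None


def build_from_accesses(accesses, seed=0):
    """Build a Treap using relative access frequencies as the weights w_x."""
    rng = random.Random(seed)
    counts = Counter(accesses)
    total = len(accesses)
    root = None
    for key, count in counts.items():
        root = insert(root, key, composite_priority(count / total, rng))
    return root
```

For example, `build_from_accesses(["a"] * 8 + ["b"] * 4 + ["c"] * 2 + ["d", "e"])` gives `"a"` weight 1/2, which lands in a strictly higher priority tier than every other key, so `"a"` becomes the root regardless of the random term. A sequence with a single distinct key has weight 1 and is rejected by this sketch, since $\log\log(1/w_x)$ is undefined there.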

Problem

Research questions and friction points this paper is trying to address.

Enhancing BSTs with learning-augmented priorities for optimal depth.

Generalizing learning-augmented BSTs to arbitrary input distributions.

Achieving dynamic operations and robustness to prediction errors.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Treaps with composite priorities for depth control.

Generalizes learning-augmented BSTs to arbitrary distributions.

Supports dynamic operations and online self-reorganization.
