AI Summary
Distributed Hash Tables (DHTs) lack native support for efficient range queries, which limits their applicability in large language model (LLM) serving, distributed databases, and blockchain systems. This paper proposes LEAD, the first DHT design that deeply integrates recursive machine learning models into its core architecture while preserving key-space ordering to enable low-overhead range queries. Key contributions include: (1) a learned order-preserving hash mapping; (2) a distributed recursive indexing scheme; (3) low-latency adaptive routing; and (4) an elastic network adaptation mechanism. Extensive experiments on real-world testbeds and large-scale simulations demonstrate that LEAD reduces query latency and communication overhead by 80% to 90% compared to state-of-the-art approaches, while exhibiting superior scalability and robustness under dynamic network conditions.
Abstract
Distributed Hash Tables (DHTs) are pivotal in numerous high-impact key-value applications built on distributed networked systems, offering a decentralized architecture that avoids single points of failure and improves data availability. Despite their widespread utility, DHTs handle range queries poorly, even though such queries are crucial for applications such as LLM serving, distributed storage, databases, content delivery networks, and blockchains. To address this limitation, we present LEAD, a novel system that incorporates learned models within DHT structures to optimize range query performance. LEAD uses a recursive machine learning model to map and retrieve data across a distributed system while preserving the inherent order of the data. Its design minimizes range query latency and message cost while maintaining high scalability and resilience to network churn. Our evaluations, conducted both on a testbed implementation and in simulation, show that LEAD is substantially more efficient than existing range query methods in large-scale distributed systems, reducing query latency and message cost by 80% to over 90%. LEAD also remains scalable and robust under system churn, providing an efficient data retrieval solution for distributed key-value systems.
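To make the idea of an order-preserving learned mapping concrete, here is a minimal Python sketch. It is not LEAD's actual model. It assumes a simple two-stage stand-in: a linear root stage picks an equal-width bucket, and a per-bucket linear stage interpolates the empirical key distribution (CDF). All names (`LearnedOrderPreservingMap`, `node_for`, `nodes_for_range`) and the parameters `num_buckets` and `num_nodes` are hypothetical. Because the mapping never decreases as keys grow, a range of keys lands on a contiguous run of nodes, which a conventional uniform hash cannot guarantee.

```python
from bisect import bisect_left


class LearnedOrderPreservingMap:
    """Illustrative two-stage monotone model of the key CDF (assumption, not LEAD's model)."""

    def __init__(self, sample_keys, num_buckets=16):
        keys = sorted(sample_keys)
        self.lo, self.hi = keys[0], keys[-1]  # requires at least two distinct keys
        self.num_buckets = num_buckets
        width = (self.hi - self.lo) / num_buckets
        # Second-stage parameters: empirical CDF value at each bucket boundary.
        self.boundary_cdf = [
            bisect_left(keys, self.lo + i * width) / len(keys)
            for i in range(num_buckets)
        ] + [1.0]

    def position(self, key):
        """Map a key to [0, 1]; a larger key never maps to a smaller position."""
        if key <= self.lo:
            return 0.0
        if key >= self.hi:
            return 1.0
        # Stage 1: a linear root model selects the bucket.
        scaled = (key - self.lo) / (self.hi - self.lo) * self.num_buckets
        b = min(int(scaled), self.num_buckets - 1)
        # Stage 2: linear interpolation between the bucket's boundary CDF values.
        start, end = self.boundary_cdf[b], self.boundary_cdf[b + 1]
        return min(start + (scaled - b) * (end - start), end)


def node_for(model, key, num_nodes):
    """Assign a key to one of num_nodes nodes laid out in key order."""
    return min(int(model.position(key) * num_nodes), num_nodes - 1)


def nodes_for_range(model, low, high, num_nodes):
    """Nodes that may hold keys in [low, high]: always a contiguous run."""
    first = node_for(model, low, num_nodes)
    last = node_for(model, high, num_nodes)
    return list(range(first, last + 1))


if __name__ == "__main__":
    # Skewed sample keys (squares) stand in for a real key distribution.
    sample = [i * i for i in range(1, 201)]
    model = LearnedOrderPreservingMap(sample, num_buckets=8)
    print(nodes_for_range(model, 100, 900, num_nodes=10))
```

In this sketch a range query only needs to contact the nodes returned by `nodes_for_range`, rather than every node that a uniform hash might have scattered the keys across. Fitting the mapping to the key distribution is also what keeps the load roughly balanced across nodes despite skew.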