NeurStore: Efficient In-database Deep Learning Model Management System

๐Ÿ“… 2025-09-03
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address the substantial storage overhead and slow loading times of deep learning models in database-integrated AI analytics, this paper proposes a fine-grained tensor management framework. The method comprises three key components: (1) an enhanced Hierarchical Navigable Small World (HNSW) graph for efficient tensor indexing; (2) tensor-level deduplication and incremental quantization compression, achieving high compression ratios under bounded accuracy degradation; and (3) a tensor-native storage engine enabling direct computation on compressed tensors and compression-aware model loading. Experimental evaluation demonstrates that, compared to state-of-the-art approaches, the framework achieves significantly higher compression ratios while maintaining competitive model loading throughputโ€”thus jointly optimizing storage efficiency and execution performance.
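The tensor-level deduplication idea can be illustrated with a minimal sketch: for each incoming tensor, find its nearest stored base tensor and, if the distance falls within a threshold, persist only the delta. All names here are hypothetical, and the brute-force nearest-neighbor scan is a stand-in for the paper's enhanced HNSW index.

```python
import numpy as np

class TensorStore:
    """Sketch of tensor-level deduplication (not the paper's implementation)."""

    def __init__(self, threshold):
        self.threshold = threshold   # max L2 distance for delta storage
        self.base = []               # full tensors kept as dedup references
        self.entries = []            # (base_id, delta) or (None, base_index)

    def put(self, t):
        # Brute-force nearest neighbor; the paper uses an enhanced HNSW graph.
        best_id, best_d = None, float("inf")
        for i, b in enumerate(self.base):
            d = np.linalg.norm(t - b)
            if d < best_d:
                best_id, best_d = i, d
        if best_id is not None and best_d <= self.threshold:
            # Similar enough: store only the (highly compressible) delta.
            self.entries.append((best_id, t - self.base[best_id]))
        else:
            # No close match: store the tensor itself as a new base.
            self.base.append(t.copy())
            self.entries.append((None, len(self.base) - 1))
        return len(self.entries) - 1

    def get(self, i):
        ref, payload = self.entries[i]
        if ref is None:
            return self.base[payload]
        return self.base[ref] + payload   # reconstruct: base + delta
```

In this sketch the second of two near-identical tensors costs only its delta, which is where the subsequent quantization step recovers most of the storage savings.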

๐Ÿ“ Abstract
With the prevalence of in-database AI-powered analytics, there is an increasing demand for database systems to efficiently manage the ever-expanding number and size of deep learning models. However, existing database systems typically store entire models as monolithic files or apply compression techniques that overlook the structural characteristics of deep learning models, resulting in suboptimal model storage overhead. This paper presents NeurStore, a novel in-database model management system that enables efficient storage and utilization of deep learning models. First, NeurStore employs a tensor-based model storage engine to enable fine-grained model storage within databases. In particular, we enhance the hierarchical navigable small world (HNSW) graph to index tensors, and only store additional deltas for tensors within a predefined similarity threshold to ensure tensor-level deduplication. Second, we propose a delta quantization algorithm that effectively compresses delta tensors, thus achieving a superior compression ratio with controllable model accuracy loss. Finally, we devise a compression-aware model loading mechanism, which improves model utilization performance by enabling direct computation on compressed tensors. Experimental evaluations demonstrate that NeurStore achieves superior compression ratios and competitive model loading throughput compared to state-of-the-art approaches.
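The delta quantization step from the abstract can be sketched as simple uniform quantization of a delta tensor, which bounds the per-element reconstruction error by half a quantization step. This is an illustrative simplification; the paper's incremental scheme adapts the bit budget to keep model accuracy loss controllable.

```python
import numpy as np

def quantize_delta(delta, n_bits=4):
    """Uniformly quantize a delta tensor into n_bits integer codes (sketch)."""
    lo, hi = float(delta.min()), float(delta.max())
    levels = (1 << n_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((delta - lo) / scale).astype(np.uint8)  # codes in [0, levels]
    return q, lo, scale

def dequantize_delta(q, lo, scale):
    """Recover an approximate delta; max abs error is about scale / 2."""
    return q.astype(np.float32) * scale + lo
```

Because deltas between similar tensors concentrate near zero, their value range (and hence `scale`) is small, so even 4-bit codes keep reconstruction error tight.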
Problem

Research questions and friction points this paper is trying to address.

Efficiently managing expanding deep learning models in databases
Reducing storage overhead by leveraging model structural characteristics
Enabling direct computation on compressed tensors for performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tensor-based storage engine for fine-grained model management
Delta quantization algorithm for superior compression ratio
Compression-aware loading mechanism for direct computation
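The innovations above can be combined in one hypothetical sketch of compression-aware computation: a matrix product against a stored weight is evaluated as `x @ (base + delta)` without materializing the full decompressed weight up front, dequantizing the delta codes only at compute time. Function names and the uniform dequantization are assumptions, not the paper's API.

```python
import numpy as np

def matmul_compressed(x, base, q, lo, scale):
    # Dequantize the uint8 delta codes lazily at compute time; the shared
    # base tensor can stay cached and be reused across similar models.
    delta = q.astype(np.float32) * scale + lo
    # x @ (base + delta) == x @ base + x @ delta
    return x @ base + x @ delta
```

Splitting the product this way also lets `x @ base` be shared across all models that deduplicate against the same base tensor, so only the small delta term is model-specific.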
๐Ÿ”Ž Similar Papers
No similar papers found.
Siqi Xiang (National University of Singapore)
Sheng Wang (Alibaba Group)
Xiaokui Xiao (National University of Singapore)
Cong Yue (National University of Singapore)
Zhanhao Zhao (National University of Singapore)
Beng Chin Ooi (Zhejiang University)