A Node-Aware Dynamic Quantization Approach for Graph Collaborative Filtering

📅 2025-08-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Deploying graph neural networks (GNNs) on edge devices faces challenges including high parameter overhead, excessive computational cost, and structural information loss caused by conventional quantization methods that ignore graph topology. To address these issues, this paper proposes a node-aware dynamic quantization framework. It adaptively adjusts quantization scales per node based on the user-item interaction graph structure, dynamically refines quantization ranges via message passing, and introduces a graph-relational gradient estimation mechanism to enhance training stability. Under 2-bit low-precision quantization, the method achieves 8โ€“12ร— model compression and 2ร— faster training speed. Empirical results demonstrate average improvements of 27.8% and 17.6% in Recall@10 and NDCG@10, respectively, over state-of-the-art methodsโ€”matching the performance of full-precision models while significantly reducing resource requirements for edge deployment.
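The core idea above, adapting the quantization scale per node instead of sharing one scale across the whole embedding table, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and each node's max absolute value stands in for the node-wise feature distribution the paper uses to initialize quantization intervals.

```python
import numpy as np

def node_aware_quantize(embeddings, bits=2):
    # Hypothetical sketch: one quantization scale per node (row), derived
    # from that node's own feature statistics (here, its max absolute
    # value), rather than a single scale shared by the whole table.
    levels = 2 ** (bits - 1) - 1                    # e.g. 1 for 2-bit signed codes
    scales = np.abs(embeddings).max(axis=1, keepdims=True) / max(levels, 1)
    scales = np.where(scales == 0.0, 1.0, scales)   # guard all-zero rows
    codes = np.clip(np.round(embeddings / scales), -levels - 1, levels)
    return codes.astype(np.int8), scales

def dequantize(codes, scales):
    # Reconstruct approximate embeddings from integer codes.
    return codes.astype(np.float32) * scales

emb = np.array([[0.9, -0.3, 0.1],
                [0.05, 0.02, -0.04]])
codes, scales = node_aware_quantize(emb, bits=2)
```

Because the second row has much smaller magnitudes than the first, a single global scale would collapse it to zeros; the per-node scale preserves its sign pattern, which is the motivation for node-aware scaling.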

๐Ÿ“ Abstract
In the realm of collaborative filtering recommendation systems, Graph Neural Networks (GNNs) have demonstrated remarkable performance but face significant challenges in deployment on resource-constrained edge devices due to their high embedding parameter requirements and computational costs. Applying common quantization methods directly to node embeddings overlooks their graph-based structure, causing error accumulation during message passing and degrading the quality of quantized embeddings. To address this, we propose Graph-based Node-Aware Dynamic Quantization training for collaborative filtering (GNAQ), a novel quantization approach that leverages graph structural information to improve the balance between efficiency and accuracy of GNNs for Top-K recommendation. GNAQ introduces a node-aware dynamic quantization strategy that adapts quantization scales to individual node embeddings by incorporating graph interaction relationships. Specifically, it initializes quantization intervals based on node-wise feature distributions and dynamically refines them through message passing in GNN layers. This approach mitigates the information loss caused by fixed quantization scales and captures hierarchical semantic features in user-item interaction graphs. Additionally, GNAQ employs graph relation-aware gradient estimation in place of the traditional straight-through estimator, ensuring more accurate gradient propagation during training. Extensive experiments on four real-world datasets demonstrate that GNAQ outperforms state-of-the-art quantization methods, including BiGeaR and N2UQ, achieving average improvements of 27.8% in Recall@10 and 17.6% in NDCG@10 under 2-bit quantization. In particular, GNAQ maintains the performance of full-precision models while reducing model size by 8 to 12 times; in addition, training is twice as fast as quantization baseline methods.
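The abstract's "dynamically refines them through message passing" step can be illustrated with a small sketch. This is an assumption-laden toy, not the paper's algorithm: `refine_ranges` and the mixing weight `alpha` are hypothetical names, and a single mean-aggregation step stands in for whatever refinement GNAQ performs inside its GNN layers.

```python
import numpy as np

def refine_ranges(ranges, adj, alpha=0.5):
    # Hypothetical sketch of one message-passing step over per-node
    # quantization ranges: each node blends its own clipping range with
    # the mean range of its neighbors in the user-item graph, so the
    # intervals reflect interaction structure rather than each node in
    # isolation. `alpha` (assumed) controls the blend.
    deg = adj.sum(axis=1)
    neighbor_mean = np.divide(adj @ ranges, deg,
                              out=ranges.astype(float).copy(),
                              where=deg > 0)           # isolated nodes keep their range
    return (1.0 - alpha) * ranges + alpha * neighbor_mean

ranges = np.array([1.0, 3.0])          # per-node clipping ranges
adj = np.array([[0.0, 1.0],
                [1.0, 0.0]])           # two connected nodes
refined = refine_ranges(ranges, adj)   # → [2.0, 2.0]
```

The two connected nodes pull their ranges toward each other, which is the intuition behind letting quantization intervals adapt to the graph rather than staying fixed.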
Problem

Research questions and friction points this paper is trying to address.

Addressing quantization errors in graph neural networks for recommendations
Balancing efficiency and accuracy in resource-constrained edge devices
Preserving graph structure information during embedding quantization process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Node-aware dynamic quantization adapting to individual embeddings
Graph interaction relationships refine quantization intervals dynamically
Graph relation-aware gradient estimation replaces straight-through estimators
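The third innovation, replacing the straight-through estimator (STE) with a graph relation-aware variant, can be sketched as below. Only the STE itself is standard; `graph_relation_grad` and its weight `beta` are hypothetical illustrations of the idea, under the assumption that neighbor gradients are aggregated by a simple mean.

```python
import numpy as np

def ste_grad(grad_out, x, clip=1.0):
    # Standard straight-through estimator: pass the upstream gradient
    # unchanged where the input lies inside the clipping range, zero it
    # elsewhere.
    return grad_out * (np.abs(x) <= clip)

def graph_relation_grad(grad_out, x, adj, clip=1.0, beta=0.5):
    # Hypothetical sketch of a graph relation-aware estimator: blend each
    # node's STE gradient with the mean gradient of its graph neighbors,
    # so gradient flow through the quantizer reflects the interaction
    # structure. `beta` (assumed) weights self vs. neighbor gradients.
    g = ste_grad(grad_out, x, clip)
    deg = adj.sum(axis=1, keepdims=True)
    neighbor_mean = np.divide(adj @ g, deg, out=np.zeros_like(g), where=deg > 0)
    return (1.0 - beta) * g + beta * neighbor_mean

grad_out = np.array([[1.0], [1.0]])
x = np.array([[0.5], [2.0]])           # second node lies outside the clip range
adj = np.array([[0.0, 1.0],
                [1.0, 0.0]])
g = graph_relation_grad(grad_out, x, adj)   # → [[0.5], [0.5]]
```

Note how the clipped node (x = 2.0), which gets zero gradient under a plain STE, still receives signal from its neighbor here; that recovered gradient flow is one plausible reading of the claimed training-stability benefit.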
Lin Li
Wuhan University of Technology, Wuhan, China
Chunyang Li
MPhil in CSE, HKUST
Natural Language Processing
Yu Yin
Wuhan University of Technology, Wuhan, China; Huawei Technologies Co., Ltd, Shanghai, China
Xiaohui Tao
Full Professor, University of Southern Queensland, Australia
Artificial Intelligence, data mining, machine learning, natural language processing, knowledge
Jianwei Zhang
Iwate University, Morioka, Japan