🤖 AI Summary
Deploying graph neural networks (GNNs) on edge devices faces challenges including high parameter overhead, excessive computational cost, and structural information loss caused by conventional quantization methods that ignore graph topology. To address these issues, this paper proposes a node-aware dynamic quantization framework. It adaptively adjusts quantization scales per node based on the user-item interaction graph structure, dynamically refines quantization ranges via message passing, and introduces a graph-relational gradient estimation mechanism to enhance training stability. Under 2-bit low-precision quantization, the method achieves 8-12× model compression and 2× faster training speed. Empirical results demonstrate average improvements of 27.8% and 17.6% in Recall@10 and NDCG@10, respectively, over state-of-the-art methods, matching the performance of full-precision models while significantly reducing resource requirements for edge deployment.
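The core idea of per-node quantization scales can be illustrated with a minimal sketch. This is not the paper's exact GNAQ algorithm; it only shows the contrast with a global scale: each node's quantization interval is initialized from that node's own feature distribution (here, its maximum magnitude), so dense and sparse embeddings are not forced onto one shared grid. The function names and the max-magnitude scale rule are illustrative assumptions.

```python
import numpy as np

def node_aware_quantize(emb, bits=2):
    """Quantize each node embedding row with its own scale.

    Sketch only: the per-node scale is initialized from the node's
    feature distribution (its max magnitude), rather than a single
    global scale shared by all nodes.
    """
    qmax = 2 ** (bits - 1) - 1                   # e.g. 1 for signed 2-bit
    # One scale per node (row), not one scale for the whole table.
    scales = np.abs(emb).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero rows
    q = np.clip(np.round(emb / scales), -qmax - 1, qmax)
    return q.astype(np.int8), scales

def dequantize(q, scales):
    # Recover approximate full-precision values for message passing.
    return q * scales
```

Because every row gets its own scale, the reconstruction error for each node is bounded by that node's step size, which a single global scale cannot guarantee when embedding magnitudes vary widely across users and items.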
📄 Abstract
In the realm of collaborative filtering recommendation systems, Graph Neural Networks (GNNs) have demonstrated remarkable performance but face significant challenges in deployment on resource-constrained edge devices due to their high embedding parameter requirements and computational costs. Applying common quantization methods directly to node embeddings overlooks their graph-based structure, causing error accumulation during message passing and degrading the quality of quantized embeddings. To address this, we propose Graph-based Node-Aware Dynamic Quantization training for collaborative filtering (GNAQ), a novel quantization approach that leverages graph structural information to enhance the balance between efficiency and accuracy of GNNs for Top-K recommendation. GNAQ introduces a node-aware dynamic quantization strategy that adapts quantization scales to individual node embeddings by incorporating graph interaction relationships. Specifically, it initializes quantization intervals based on node-wise feature distributions and dynamically refines them through message passing in GNN layers. This approach mitigates information loss caused by fixed quantization scales and captures hierarchical semantic features in user-item interaction graphs. Additionally, GNAQ employs graph relation-aware gradient estimation in place of the traditional straight-through estimator, ensuring more accurate gradient propagation during training. Extensive experiments on four real-world datasets demonstrate that GNAQ outperforms state-of-the-art quantization methods, including BiGeaR and N2UQ, achieving average improvements of 27.8% in Recall@10 and 17.6% in NDCG@10 under 2-bit quantization. In particular, GNAQ maintains the performance of full-precision models while reducing model size by 8 to 12 times; in addition, training is twice as fast as the quantization baseline methods.