🤖 AI Summary
Existing graph tokenization methods fall short in hierarchical structure modeling and task adaptability: their quantization strategies are often fixed or task-agnostic, producing imbalanced structural representations and preventing dynamic adjustment of multi-scale contributions without retraining the encoder. This paper proposes HQ-Graph, a Hierarchical Quantization Graph tokenization framework that enables dynamic multi-scale structural aggregation under a frozen encoder via a lightweight self-weighted gating mechanism, supporting task-adaptive discrete representation learning. Its core contribution is the tight integration of hierarchical quantization, discrete representation, learnable gating, and multi-scale aggregation. Experiments show that HQ-Graph consistently outperforms strong baselines on node classification and link prediction at comparable computational cost, balancing expressive power against parameter efficiency.
📝 Abstract
Recent progress in language and vision foundation models demonstrates the importance of discrete token interfaces that transform complex inputs into compact sequences for large-scale modeling. Extending this paradigm to graphs requires a tokenization scheme that efficiently handles non-Euclidean structures and multi-scale dependencies. Existing approaches to graph tokenization (linearized, continuous, and quantized) remain limited in adaptability and efficiency. In particular, most quantization-based tokenizers organize hierarchical information in fixed or task-agnostic ways, which may either over-represent or under-utilize structural cues, and they cannot dynamically reweight contributions from different levels without retraining the encoder. This work presents a hierarchical quantization framework that introduces a self-weighted mechanism for task-adaptive aggregation across multiple scales. The proposed method keeps the encoder frozen while modulating information flow through a lightweight gating process, enabling parameter-efficient adaptation to diverse downstream tasks. Experiments on benchmark datasets for node classification and link prediction demonstrate consistent improvements over strong baselines under comparable computational budgets.
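The self-weighted gating idea described above can be sketched minimally: a frozen encoder produces one embedding per hierarchy level, and a small set of learnable gate logits forms a convex combination over those levels. This is a hypothetical numpy illustration of the general mechanism, not HQ-Graph's actual implementation; the function names, shapes, and logit values are assumptions for the sake of the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D logit vector
    e = np.exp(x - x.max())
    return e / e.sum()

def self_weighted_aggregate(scale_embeddings, gate_logits):
    """Aggregate per-scale node embeddings with learnable gate weights.

    scale_embeddings: list of K arrays, each (num_nodes, dim), e.g.
        outputs of a frozen hierarchical encoder (one per scale).
    gate_logits: (K,) learnable parameters; in a parameter-efficient
        setup only these (plus a task head) would be trained.
    """
    weights = softmax(gate_logits)           # convex combination over scales
    stacked = np.stack(scale_embeddings)     # (K, num_nodes, dim)
    return np.tensordot(weights, stacked, axes=1)  # (num_nodes, dim)

# Toy usage: 3 scales, 5 nodes, 8-dim embeddings from a "frozen" encoder
rng = np.random.default_rng(0)
scales = [rng.standard_normal((5, 8)) for _ in range(3)]
logits = np.array([0.0, 1.0, -1.0])  # hypothetical learned gate values
z = self_weighted_aggregate(scales, logits)
print(z.shape)  # (5, 8)
```

Because only the K gate logits change per task, adapting to a new downstream task leaves the encoder and codebooks untouched, which is what keeps the adaptation lightweight.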