🤖 AI Summary
This work addresses a limitation of existing VQ-VAE-based semantic communication systems: they rely on fixed-size codebooks and cannot achieve fine-grained rate adaptation without retraining. To overcome this, the authors propose a zero-shot dynamic codebook scaling method that embeds the codebook vectors into hyperbolic space to expose their semantic hierarchy, builds a semantic tree using a minimum spanning tree algorithm, and iteratively prunes the least important leaf nodes to adjust the codebook size on the fly. This approach enables rate adaptation without any retraining, achieving reconstruction quality nearly identical to that of dedicated models trained from scratch at a fraction of the computational budget. It also remains robust at very low rates, where conventionally trained codebooks fail, offering a practical path to flexible, training-free rate adaptation in semantic communication systems.
📝 Abstract
Digital semantic communication systems, which often leverage the Vector Quantized Variational Autoencoder (VQ-VAE) framework, are pivotal for future wireless networks. In a VQ-VAE-based semantic communication system, the transmission rate is directly governed by the size of a discrete codebook known as the knowledge base (KB). However, the KB size is a fixed hyperparameter, so adapting the rate requires training and storing a separate model for each desired size. The computation and storage costs of this practice are prohibitive for truly granular rate control. To address this, we introduce a principled, zero-shot KB resizing method that enables on-the-fly rate adaptation without any retraining. Our approach establishes a global importance ranking for all vectors within a single, large parent KB by uncovering its inherent semantic hierarchy. This is achieved via a three-step framework: 1) embedding KB vectors into hyperbolic space to reveal their hierarchical relationships; 2) constructing a master semantic tree using a minimum spanning tree algorithm; 3) enabling instant resizing by iteratively pruning the least important leaf nodes. Extensive simulations demonstrate that our method achieves reconstruction quality nearly identical to that of dedicated KBs trained from scratch, while demanding only a fraction of the computational budget. Moreover, our approach exhibits superior robustness at very low rates, where conventional KBs suffer from catastrophic failure. Our work resolves a fundamental limitation of VQ-VAE-based semantic communication systems, offering a practical and efficient path toward flexible and rate-adaptive semantic communication.
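
The three-step framework can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and several details are assumptions made only for illustration: the abstract does not say how the hyperbolic embedding is obtained, so `to_poincare_ball` simply rescales the vectors into the Poincaré ball; leaf importance is taken to be the length of the leaf's single tree edge (shortest edge pruned first); and `bits_per_index` assumes fixed-length index coding. All function names are hypothetical.

```python
import math


def to_poincare_ball(vectors, max_radius=0.9):
    """Step 1 (assumed stand-in): rescale vectors so they lie inside the unit ball."""
    largest = max(math.sqrt(sum(x * x for x in v)) for v in vectors) or 1.0
    scale = max_radius / largest
    return [tuple(x * scale for x in v) for v in vectors]


def poincare_distance(u, v):
    """Geodesic distance between two points of the Poincare ball model."""
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    norm_u = sum(a * a for a in u)
    norm_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - norm_u) * (1.0 - norm_v)))


def minimum_spanning_tree(points):
    """Step 2: Prim's algorithm on the complete graph of hyperbolic distances.

    Returns an adjacency map {node: {neighbor: edge_weight}}.
    """
    adjacency = {i: {} for i in range(len(points))}
    # best[i] = (cheapest known distance from the tree to i, tree node giving it)
    best = {i: (poincare_distance(points[0], points[i]), 0)
            for i in range(1, len(points))}
    while best:
        node = min(best, key=lambda i: best[i][0])
        weight, parent = best.pop(node)
        adjacency[node][parent] = weight
        adjacency[parent][node] = weight
        for other in best:
            d = poincare_distance(points[node], points[other])
            if d < best[other][0]:
                best[other] = (d, node)
    return adjacency


def pruning_order(adjacency):
    """Step 3: repeatedly remove the least important leaf (assumed: shortest edge).

    Returns node indices from least to most important.
    """
    tree = {node: dict(nbrs) for node, nbrs in adjacency.items()}  # work on a copy
    order = []
    while len(tree) > 1:
        leaves = [n for n, nbrs in tree.items() if len(nbrs) == 1]
        leaf = min(leaves, key=lambda n: (next(iter(tree[n].values())), n))
        (neighbor,) = tree.pop(leaf)  # a leaf has exactly one neighbor
        del tree[neighbor][leaf]
        order.append(leaf)
    order.extend(tree)  # the last surviving node is the most important
    return order


def resize(order, target_size):
    """Keep the target_size most important entries of the parent KB."""
    return sorted(order[len(order) - target_size:])


def bits_per_index(kb_size):
    """Bits needed per transmitted index, assuming fixed-length coding."""
    return math.ceil(math.log2(kb_size))
```

Because the ranking is computed once for the parent KB, any smaller KB is just a suffix of `order`, so KBs of different sizes are nested and switching between them needs no retraining. For instance, `resize(order, 256)` and `resize(order, 300)` would both cost one list slice, and under the fixed-length assumption `bits_per_index` gives 8 and 9 bits per index respectively.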