Lightweight Road Environment Segmentation using Vector Quantization

📅 2025-04-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conventional FCN- and Transformer-based models for autonomous driving scene segmentation rely on continuous encoder features and struggle to model discrete semantic structure. To address this, the paper proposes VQ-MobileUNETR, a lightweight segmentation framework enhanced with vector quantization (VQ). It is described as the first work to integrate a VQ mechanism into the MobileUNETR architecture: continuous feature representations are discretized into compact codebook vectors without increasing the model's initial size or complexity. The discretization is credited with improved feature discriminability, robustness to noise, and stronger semantic structuring. Evaluated on Cityscapes, VQ-MobileUNETR achieves 77.0% mIoU, outperforming the MobileUNETR baseline by 2.9%, which suggests that discrete representations improve semantic clustering. The work presents this as an efficient approach to structured visual representation learning in real-time autonomous driving systems.

📝 Abstract
Road environment segmentation plays a significant role in autonomous driving. Numerous works based on Fully Convolutional Networks (FCNs) and Transformer architectures have been proposed to leverage local and global contextual learning for efficient and accurate semantic segmentation. In both architectures, the encoder relies heavily on extracting continuous representations from the image, which limits its ability to represent meaningful discrete information. To address this limitation, we propose segmenting the autonomous driving environment using vector quantization. Vector quantization offers three primary advantages for road environment segmentation. (1) Each continuous feature from the encoder is mapped to a discrete vector from the codebook, helping the model discover distinct features more easily than with complex continuous features. (2) Since the discrete features act as compressed versions of the encoder's continuous features, they also suppress noise and outliers, which benefits the segmentation task. (3) Vector quantization encourages the latent space to form coarse clusters of continuous features, forcing the model to group similar features and making the learned representations more structured for the decoding process. In this work, we combine vector quantization with the lightweight image segmentation model MobileUNETR and use the unmodified MobileUNETR as the baseline for comparison. In our experiments, the proposed model achieves 77.0% mIoU on Cityscapes, outperforming the baseline by 2.9% without increasing the model's initial size or complexity.
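The mapping in point (1) can be illustrated with a nearest-neighbour codebook lookup. The following is a minimal sketch in dependency-free Python, not the paper's implementation: the function names (`squared_distance`, `quantize`), the squared Euclidean distance, and the toy codebook and feature values are all assumptions chosen for illustration. In a trained model the codebook would be learned and the features would come from the encoder.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous feature to its nearest codebook entry.

    Returns the chosen code indices and the corresponding quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for f in features:
        # Pick the codebook index with the smallest distance to this feature.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(f, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical toy codebook with three 2-D entries (not values from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
# Hypothetical encoder outputs, one 2-D feature per spatial location.
features = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.4], [1.1, 0.8]]

indices, quantized = quantize(features, codebook)
print(indices)    # [0, 1, 2, 1]
print(quantized)  # [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0], [1.0, 1.0]]
```

Each feature is replaced by one of only three possible vectors, so the decoder sees a small discrete vocabulary instead of arbitrary continuous values. The second and fourth features differ slightly but receive the same code, which is the grouping behaviour described in point (3).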
Problem

Research questions and friction points this paper is trying to address.

Improving road segmentation using vector quantization
Enhancing discrete feature representation in segmentation models
Reducing noise in segmentation with compressed discrete features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vector quantization for discrete feature mapping
Noise compression via discrete feature representation
Latent space clustering for structured decoding
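The noise-compression idea listed above can be sketched with a toy experiment. This is an illustrative assumption, not an experiment from the paper: the helper `nearest_code`, the codebook values, and the noise bound of 0.2 per coordinate are hypothetical. The sketch shows that small perturbations of a feature leave its assigned code unchanged, so the quantized representation discards that noise.

```python
import random


def nearest_code(feature, codebook):
    """Index of the codebook entry closest to `feature` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((x - c) ** 2 for x, c in zip(feature, codebook[k])),
    )


random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical codebook; the closest pair of entries is about 1.41 apart.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Clean features sit exactly on codebook entries; noisy copies add a
# perturbation of at most 0.2 per coordinate.
clean = [codebook[k] for k in (0, 1, 2, 1, 0)]
noisy = [[x + random.uniform(-0.2, 0.2) for x in f] for f in clean]

clean_codes = [nearest_code(f, codebook) for f in clean]
noisy_codes = [nearest_code(f, codebook) for f in noisy]

# The perturbation is smaller than half the minimum distance between
# codebook entries, so every noisy feature keeps the code of its clean version.
print(clean_codes == noisy_codes)  # True
```

The guarantee holds only while the noise stays below half the distance between neighbouring codebook entries. Larger perturbations can move a feature across a cluster boundary and change its code.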