Traffic Sign Recognition in Autonomous Driving: Dataset, Benchmark, and Field Experiment

📅 2026-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key challenges in traffic sign recognition for autonomous driving—namely cross-regional variation, long-tailed category distribution, and semantic ambiguity—by introducing TS-1M, a large-scale dataset comprising over one million images annotated across 454 standardized classes. The study further proposes the first fine-grained diagnostic benchmark tailored for real-world deployment, evaluating cross-regional generalization, rare-class detection, robustness under low-resolution conditions, and semantic comprehension. Through a unified assessment of classic supervised models, self-supervised pre-trained architectures, and multimodal vision-language models (VLMs), the research highlights the critical role of semantic alignment in enhancing model generalization. Experimental results demonstrate that VLMs substantially outperform purely visual models and effectively support map-level decision constraints in real-world road systems.

📝 Abstract
Traffic Sign Recognition (TSR) is a core perception capability for autonomous driving, where robustness to cross-region variation, long-tailed categories, and semantic ambiguity is essential for reliable real-world deployment. Despite steady progress in recognition accuracy, existing traffic sign datasets and benchmarks offer limited diagnostic insight into how different modeling paradigms behave under these practical challenges. We present TS-1M, a large-scale and globally diverse traffic sign dataset comprising over one million real-world images across 454 standardized categories, together with a diagnostic benchmark designed to analyze model capability boundaries. Beyond standard train-test evaluation, we provide a suite of challenge-oriented settings, including cross-region recognition, rare-class identification, low-clarity robustness, and semantic text understanding, enabling systematic and fine-grained assessment of modern TSR models. Using TS-1M, we conduct a unified benchmark across three representative learning paradigms: classical supervised models, self-supervised pretrained models, and multimodal vision-language models (VLMs). Our analysis reveals consistent paradigm-dependent behaviors, showing that semantic alignment is a key factor for cross-region generalization and rare-category recognition, while purely visual models remain sensitive to appearance shift and data imbalance. Finally, we validate the practical relevance of TS-1M through real-scene autonomous driving experiments, where traffic sign recognition is integrated with semantic reasoning and spatial localization to support map-level decision constraints. Overall, TS-1M establishes a reference-level diagnostic benchmark for TSR and provides principled insights into robust and semantic-aware traffic sign perception. Project page: https://guoyangzhao.github.io/projects/ts1m.
Problem

Research questions and friction points this paper is trying to address.

Traffic Sign Recognition
Cross-region variation
Long-tailed categories
Semantic ambiguity
Diagnostic benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

Traffic sign recognition
Large-scale dataset
Diagnostic benchmark
Cross-region generalization
Vision-language models
Guoyang Zhao
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Weiqing Qi
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Kai Zhang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Chenguang Zhang
Graduate student, University of Pittsburgh
motor neuroscience, brain-computer interface
Zeying Gong
The Hong Kong University of Science and Technology (Guangzhou)
Forecasting, Embodied AI
Zhihai Bi
Fudan University; HKUST(GZ)
Robotics, Loco-Manipulation, Reinforcement Learning
Kai Chen
Hong Kong University of Science and Technology
Representation Learning, Generative Modeling, Multi-modality, Mixture-of-Experts
Benshan Ma
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Ming Liu
Shenzhen Unity Drive Innovation Technology Co., Ltd., Shenzhen 518063, China
Jun Ma
Assistant Professor, The Hong Kong University of Science and Technology
Robotics, Autonomous Driving, Motion Planning and Control, Optimization