A Large-Scale Dataset and Reproducible Framework for RF Fingerprinting on IEEE 802.11g Same-Model Devices

📅 2025-11-11
🤖 AI Summary
RF fingerprinting of same-model devices is hampered by small datasets, models that are hard to reproduce, and subtle hardware variations. This work introduces a large-scale dataset of same-model IEEE 802.11g devices, comprising 123 devices, 35.42 million raw I/Q samples taken from the preambles, and 1.85 million RF features, alongside an open-source experimental framework that is reproducible from data collection to final evaluation. A Random Forest-based classifier built on the extracted features achieves 89.06% device identification accuracy on the dataset. Together, the dataset and framework provide a benchmark and a reproducible evaluation procedure for RF fingerprinting of same-model devices.

📝 Abstract
Radio frequency (RF) fingerprinting exploits hardware imperfections for device identification, but distinguishing between same-model devices remains challenging due to their minimal hardware variations. Existing datasets for RF fingerprinting are constrained by small device scales and heterogeneous models, which hinder robust training and fair evaluation of machine learning models. To address this gap, we introduce a large-scale dataset of same-model devices along with a fully reproducible, open-source experimental framework. The dataset is built using 123 identical commercial IEEE 802.11g devices and contains 35.42 million raw I/Q samples from the preambles and the corresponding 1.85 million RF features. The open-source framework further ensures full reproducibility from data collection to final evaluation. Within this framework, a Random Forest-based algorithm is proposed that achieves 89.06% identification accuracy on this dataset. Extensive experimental evaluations further confirm the relationships between the extracted features.
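
The abstract does not list which RF features are extracted from the preamble I/Q samples. As a hedged illustration of how one such feature could be derived, the sketch below estimates carrier frequency offset (CFO), a feature commonly used in RF fingerprinting, by correlating a repeated preamble segment with a delayed copy of itself. The function name `estimate_cfo_hz`, the 20 MHz sample rate, and the 16-sample repetition period (the length of one 802.11 OFDM short training symbol at that rate) are assumptions for illustration and are not taken from the paper's framework.

```python
import cmath
import math


def estimate_cfo_hz(iq, period=16, sample_rate_hz=20e6):
    """Delay-and-correlate CFO estimate for a preamble that repeats every `period` samples.

    Hypothetical helper for illustration; not part of the paper's framework.
    """
    if len(iq) <= period:
        raise ValueError("need more than one repetition period of samples")
    # Correlate each sample with the sample one repetition period later.
    acc = sum(iq[n].conjugate() * iq[n + period] for n in range(len(iq) - period))
    # A frequency offset f rotates the phase by 2*pi*f*period/fs between repetitions,
    # so the angle of the accumulated correlation gives f (unambiguous for |f| < fs/(2*period)).
    return cmath.phase(acc) * sample_rate_hz / (2 * math.pi * period)


# Synthetic check: a unit-magnitude waveform with period 16 and a 50 kHz offset applied.
fs = 20e6
true_cfo = 50e3
base = [cmath.exp(2j * math.pi * (k % 16) ** 2 / 16) for k in range(160)]
rx = [s * cmath.exp(2j * math.pi * true_cfo * n / fs) for n, s in enumerate(base)]
estimated = estimate_cfo_hz(rx)
```

On noiseless synthetic input the estimate recovers the applied offset. Real captures would add noise and channel effects, so a per-frame estimate like this would normally be one of several features fed to a classifier.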

Problem

Research questions and friction points this paper is trying to address.

Distinguishing same-model wireless devices using RF fingerprinting techniques
Addressing limitations of small datasets with heterogeneous device models
Providing a reproducible framework for RF fingerprinting evaluation and validation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale dataset of identical commercial devices
Fully reproducible open-source experimental framework
Random Forest-based algorithm achieving 89.06% identification accuracy on the dataset
Zewei Guo
Future University Hakodate, Hakodate, 041-8655, Hokkaido, Japan
Zhen Jia
Keio University, Fujisawa, 108-8345, Kanagawa, Japan
Jinxiao Zhu
Tokyo Denki University, Tokyo, 120-8551, Tokyo, Japan
Wenhao Huang
Keio University, Fujisawa, 108-8345, Kanagawa, Japan
Yin Chen
Lecturer in Mathematics at University of Saskatchewan
Invariant theory, Lie theory, Commutative algebra, Applied algebraic geometry