RadioFormer: A Multiple-Granularity Radio Map Estimation Transformer with 1‱ Spatial Sampling

📅 2025-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses radio map estimation under extremely sparse spatial sampling, where only about 0.01% (1‱) of pixels are observed. We propose RadioFormer, a multiple-granularity Transformer built on two attention modules: a dual-stream self-attention (DSA) module that models fine-grained, pixel-wise correlations of observed signal power in one stream and coarse-grained, patch-wise building geometry in the other, and a cross-stream cross-attention (CCA) module that fuses the two streams into multi-scale radio map representations. On the public RadioMapSeer dataset, the method reports state-of-the-art estimation accuracy at the lowest computational cost among the compared methods, along with strong generalization and robust zero-shot performance.

📝 Abstract
The task of radio map estimation aims to generate a dense representation of electromagnetic spectrum quantities, such as the received signal strength at each grid point within a geographic region, based on measurements from a subset of spatially distributed nodes (represented as pixels). Recently, deep vision models such as the U-Net have been adapted to radio map estimation; their effectiveness can be guaranteed when each map has sufficient spatial observations (typically 0.01% to 1% of pixels) to model the local dependency of observed signal power. However, such a setting of sufficient measurements can be less practical in real-world scenarios, where extreme sparsity in spatial sampling is widely encountered. To address this challenge, we propose RadioFormer, a novel multiple-granularity transformer designed to handle the constraints posed by spatially sparse observations. Through a dual-stream self-attention (DSA) module, RadioFormer discovers the correlation of pixel-wise observed signal power and learns patch-wise building geometries at multiple granularities; a cross-stream cross-attention (CCA) module then integrates these into multi-scale representations of radio maps. Extensive experiments on the public RadioMapSeer dataset demonstrate that RadioFormer outperforms state-of-the-art methods in radio map estimation while maintaining the lowest computational cost. Furthermore, the proposed approach exhibits exceptional generalization capabilities and robust zero-shot performance, underscoring its potential to advance radio map estimation in a more practical setting with very limited observation nodes.
Problem

Research questions and friction points this paper is trying to address.

Estimating dense radio maps from extremely sparse spatial sampling
Overcoming limitations of deep vision models with insufficient measurements
Integrating pixel-wise and patch-wise signal correlations for multi-scale estimation
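To give a sense of how sparse 1‱ sampling is, the following minimal sketch draws observation pixels from a synthetic map. The 256×256 grid size, the signal value range, and the helper name `sample_sparse_observations` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def sample_sparse_observations(radio_map, rate=1e-4, seed=0):
    """Pick a random subset of pixels to act as observation nodes.

    Hypothetical helper: `rate` is the fraction of pixels observed
    (1e-4 corresponds to 1 per ten thousand).
    """
    rng = np.random.default_rng(seed)
    height, width = radio_map.shape
    # Keep at least one observation so the estimator has some input.
    n_obs = max(1, round(height * width * rate))
    flat_idx = rng.choice(height * width, size=n_obs, replace=False)
    rows, cols = np.unravel_index(flat_idx, (height, width))
    coords = np.stack([rows, cols], axis=1)   # (n_obs, 2) pixel positions
    values = radio_map[rows, cols]            # (n_obs,) observed signal values
    return coords, values

# Synthetic stand-in for a ground-truth radio map (assumed 256x256 grid).
radio_map = np.random.default_rng(1).uniform(-120.0, -40.0, size=(256, 256))
coords, values = sample_sparse_observations(radio_map)
print(len(values))  # 7 observed pixels out of 65,536
```

Under these assumptions a whole map is represented by only a handful of measurements, which is why models that rely on local neighborhoods of observed pixels have little to work with.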
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple-granularity transformer for sparse observations
Dual-stream self-attention for signal and geometry
Cross-attention integrates multi-scale radio maps
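The three points above can be sketched with plain scaled dot-product attention. This is only an illustrative sketch, not the authors' implementation: it omits learned projections, multiple heads, and positional encodings, and the token counts, embedding width, and the choice of letting patch tokens query pixel tokens are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Single-head scaled dot-product attention without learned weights."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ values

rng = np.random.default_rng(0)
d_model = 16                                       # assumed embedding width
pixel_tokens = rng.normal(size=(7, d_model))       # fine-grained: one token per observed pixel
patch_tokens = rng.normal(size=(64, d_model))      # coarse-grained: one token per building-map patch

# Dual-stream self-attention: each stream attends only within itself.
pixel_stream = attention(pixel_tokens, pixel_tokens, pixel_tokens)
patch_stream = attention(patch_tokens, patch_tokens, patch_tokens)

# Cross-stream cross-attention: patch tokens query the pixel stream
# (direction chosen for illustration only).
fused = attention(patch_stream, pixel_stream, pixel_stream)
print(fused.shape)  # (64, 16)
```

In this sketch the fused output has one vector per patch, each a weighted mix of the sparse pixel observations, which is one plausible way to spread a few measurements across the geometry of the whole map.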
Authors
Zheng Fang
Pengcheng Laboratory, China, and also with the Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
Kangjun Liu
Pengcheng Laboratory, China
Ke Chen
Pengcheng Laboratory, China
Qingyu Liu
Electronic and Computer Engineering, Peking University
wireless networking, mobile networking, internet of things, intelligent transportation
Jianguo Zhang
Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
Lingyang Song
Peking University & Peng Cheng Laboratory, China
Wireless Communications, Mobile Computing, Machine Learning, Game Theory
Yaowei Wang
The Hong Kong Polytechnic University