🤖 AI Summary
Problem: Low-cost IoT air quality sensors generalize poorly across cities and depend on many reference sensors for calibration, because traffic, meteorology, and other factors induce spatiotemporal distribution shifts. Method: This paper introduces unsupervised domain adaptation (UDA) for graph-structured spatiotemporal regression. It proposes a subspace alignment method based on the closed-form solution of a Tikhonov-regularized least-squares problem, using Cholesky decomposition and power iteration for efficient, scalable domain alignment. The alignment is coupled with a spatiotemporal graph neural network (STGNN) that captures local spatiotemporal dynamics. Contribution/Results: Evaluated on multi-city real-world datasets, the approach significantly improves PM₂.₅ and NO₂ prediction accuracy, reducing MAE by 32.7%, and enables zero-shot cross-city calibration with only a few reference sensors, achieving both strong generalization and practical deployability.
📝 Abstract
The deployment of affordable Internet of Things (IoT) sensors for air pollution monitoring has increased in recent years owing to their scalability and cost-effectiveness. However, accurately calibrating these sensors in uncontrolled environments remains a significant challenge. Expensive reference sensors provide accurate ground-truth data, but their high cost limits them to small-scale deployments, so labeled data are scarce. In diverse urban environments, data distributions shift continually with traffic patterns, industrial activity, and weather conditions, all of which affect sensor readings. Consequently, traditional machine learning models, despite their increasing use for environmental sensor calibration, often fail to provide reliable pollutant measurements across locations because of domain shift. To address these challenges, we propose a novel unsupervised domain adaptation (UDA) method tailored to regression tasks on graph-structured data. Our approach uses Graph Neural Networks (GNNs) to model the relationships between sensors. To capture critical spatial-temporal interactions, we incorporate spatial-temporal graph neural networks (STGNNs), which extend GNNs with temporal dynamics. To handle the resulting larger embeddings, we propose a domain adaptation method based on a closed-form solution inspired by the Tikhonov-regularized least-squares problem. This method uses Cholesky decomposition and power iteration to align the subspaces of the source and target domains. Aligning these subspaces allows low-cost IoT sensors to learn calibration parameters from expensive reference sensors, enabling reliable pollutant measurements in new locations without additional costly equipment.
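The ingredients named in the abstract (power iteration, Cholesky decomposition, and a Tikhonov-regularized least-squares closed form) can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the exact objective, so the sketch assumes the alignment matrix M minimizes ||P_s M - P_t||_F^2 + lam ||M||_F^2, where P_s and P_t are orthonormal bases of the top-k principal subspaces of the source and target embeddings. The function names, the parameter lam, and the use of block power iteration for the bases are all hypothetical choices made for this example.

```python
import numpy as np


def top_k_subspace(X, k, n_iter=200, seed=0):
    """Orthonormal basis (d x k) of the top-k principal subspace of X.

    Uses block power (orthogonal) iteration on the sample covariance.
    Assumption for this sketch: the paper may compute bases differently.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    C = Xc.T @ Xc / max(len(X) - 1, 1)
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((C.shape[0], k)))
    for _ in range(n_iter):
        # Multiply by the covariance, then re-orthonormalize.
        Q, _ = np.linalg.qr(C @ Q)
    return Q


def tikhonov_solve(A, B, lam):
    """Closed-form minimizer of ||A M - B||_F^2 + lam * ||M||_F^2.

    Solves (A^T A + lam I) M = A^T B with a Cholesky factorization.
    """
    G = A.T @ A + lam * np.eye(A.shape[1])
    L = np.linalg.cholesky(G)            # G = L L^T, L lower triangular
    Y = np.linalg.solve(L, A.T @ B)      # solve L Y = A^T B
    return np.linalg.solve(L.T, Y)       # solve L^T M = Y


def align_source_to_target(Xs, Xt, k, lam=1e-3):
    """Project both domains onto k-dim subspaces and align source to target."""
    Ps = top_k_subspace(Xs, k)
    Pt = top_k_subspace(Xt, k)
    M = tikhonov_solve(Ps, Pt, lam)      # k x k alignment matrix
    Zs = (Xs - Xs.mean(axis=0)) @ Ps @ M  # aligned source features
    Zt = (Xt - Xt.mean(axis=0)) @ Pt      # target features
    return Zs, Zt, M
```

In this sketch, Xs and Xt stand in for embeddings produced by an STGNN on the source and target cities; a regressor trained on Zs with source labels would then be applied to Zt. Because P_s is orthonormal here, the regularized solution reduces to a scaled version of P_s^T P_t, so the Cholesky step matters mainly when the regression is posed on non-orthonormal inputs.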