CAAL: Confidence-Aware Active Learning for Heteroscedastic Atmospheric Regression

📅 2026-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the inefficiency of conventional active learning for regressing atmospheric particle properties, where labels are expensive to obtain and the observation noise is heteroscedastic (input-dependent). The authors propose a Confidence-Aware Active Learning (CAAL) framework that decouples predictive mean estimation from noise modeling, so that input-dependent noise is captured explicitly. CAAL introduces an acquisition function that dynamically weights epistemic uncertainty by the predicted noise level, using that noise as a reliability signal to guide sample selection. In doing so, it separates reducible epistemic uncertainty from irreducible aleatoric observation noise rather than conflating the two in a single predictive-uncertainty score. Experiments on particle-resolved simulations and real-world atmospheric observations show that CAAL consistently outperforms standard active learning baselines, supporting the construction of particle property databases under limited annotation budgets.
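The split between the two uncertainty types referred to above can be written with the law of total variance. This is a standard identity, not a formula taken from the paper; the symbols $\theta$ (model parameters, e.g. ensemble members), $\mu_\theta(x)$ (predicted mean) and $\sigma^2_\theta(x)$ (predicted noise variance) are notation assumed here for illustration.

```latex
\operatorname{Var}[y \mid x]
  = \underbrace{\mathbb{E}_{\theta}\!\left[\sigma^2_{\theta}(x)\right]}_{\text{aleatoric (irreducible noise)}}
  + \underbrace{\operatorname{Var}_{\theta}\!\left[\mu_{\theta}(x)\right]}_{\text{epistemic (reducible with more labels)}}
```

Only the second term shrinks as labeled data are added, which is why an acquisition rule driven by total predictive variance can spend budget in noise-dominated regions.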

📝 Abstract
Quantifying the impacts of air pollution on health and climate relies on key atmospheric particle properties such as toxicity and hygroscopicity. However, these properties typically require complex observational techniques or expensive particle-resolved numerical simulations, limiting the availability of labeled data. We therefore estimate these hard-to-measure particle properties from routinely available observations (e.g., air pollutant concentrations and meteorological conditions). Because routine observations only indirectly reflect particle composition and structure, the mapping from routine observations to particle properties is noisy and input-dependent, yielding a heteroscedastic regression setting. With a limited and costly labeling budget, the central challenge is to select which samples to measure or simulate. While active learning is a natural approach, most acquisition strategies rely on predictive uncertainty. Under heteroscedastic noise, this signal conflates reducible epistemic uncertainty with irreducible aleatoric uncertainty, causing limited budgets to be wasted in noise-dominated regions. To address this challenge, we propose a confidence-aware active learning framework (CAAL) for efficient and robust sample selection in heteroscedastic settings. CAAL consists of two components: a decoupled uncertainty-aware training objective that separately optimises the predictive mean and noise level to stabilise uncertainty estimation, and a confidence-aware acquisition function that dynamically weights epistemic uncertainty using predicted aleatoric uncertainty as a reliability signal. Experiments on particle-resolved numerical simulations and real atmospheric observations show that CAAL consistently outperforms standard AL baselines. The proposed framework provides a practical and general solution for the efficient expansion of high-cost atmospheric particle property databases.
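The acquisition idea described in the abstract can be illustrated with a minimal sketch. It assumes an ensemble whose members each predict a mean and a noise variance per candidate; the function names, the `scale` parameter, and the weighting `1 / (1 + aleatoric / scale)` are hypothetical choices made for illustration and are not the formula used in the paper.

```python
import numpy as np


def decompose_uncertainty(means, variances):
    """Split predictive uncertainty for each candidate sample.

    means, variances: arrays of shape (n_members, n_candidates) holding each
    ensemble member's predicted mean and predicted noise variance.
    """
    epistemic = means.var(axis=0)       # disagreement between ensemble members
    aleatoric = variances.mean(axis=0)  # average predicted noise variance
    return epistemic, aleatoric


def confidence_weighted_scores(epistemic, aleatoric, scale=1.0):
    """Down-weight epistemic uncertainty where predicted noise is high.

    Hypothetical weighting (an assumption, not the paper's acquisition
    function): confidence falls from 1 towards 0 as noise grows.
    """
    confidence = 1.0 / (1.0 + aleatoric / scale)
    return epistemic * confidence


def select_batch(scores, k):
    """Return the indices of the k highest-scoring candidates."""
    return np.argsort(scores)[::-1][:k]


# Toy pool of three candidates scored by a two-member ensemble.
means = np.array([[0.0, 1.0, 2.0],
                  [0.0, 3.0, 4.0]])
variances = np.array([[0.1, 0.1, 9.0],
                      [0.1, 0.1, 9.0]])

epistemic, aleatoric = decompose_uncertainty(means, variances)
scores = confidence_weighted_scores(epistemic, aleatoric)
chosen = select_batch(scores, k=1)
```

In this toy pool, candidates 1 and 2 have the same ensemble disagreement, but candidate 2 sits in a high-noise region, so the weighted score prefers candidate 1. An acquisition rule based on epistemic variance alone could not distinguish the two.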
Problem

Research questions and friction points this paper is trying to address.

heteroscedastic regression
active learning
atmospheric particle properties
label efficiency
uncertainty quantification
Innovation

Methods, ideas, or system contributions that make the work stand out.

active learning
heteroscedastic regression
uncertainty quantification
confidence-aware acquisition
atmospheric particle properties
Fei Jiang
Earth and Environmental Sciences, The University of Manchester
Jiyang Xia
Earth and Environmental Sciences, The University of Manchester
Junjie Yu
Southern University of Science and Technology
Deep Learning, Neuroscience
Mingfei Sun
Lecturer (Assistant Professor), University of Manchester
Reinforcement Learning, Generative Models, Human-Robot Interaction
Hugh Coe
Professor of Atmospheric Composition, University of Manchester
atmospheric composition, aerosol, pollution, measurements
David Topping
Earth and Environmental Sciences, The University of Manchester
Dantong Liu
Department of Atmospheric Sciences, Zhejiang University
Zhenhui Jessie Li
Yunqi Academy of Engineering
city intelligence
Zhonghua Zheng
Earth and Environmental Sciences, The University of Manchester