Mitigating representation bias caused by missing pixels in methane plume detection

📅 2025-10-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Cloud cover and similar factors leave satellite imagery with systematically missing pixels, a Missing-Not-At-Random (MNAR) pattern. In methane plume detection this introduces representation bias: the model learns a spurious association between coverage (the percentage of valid pixels in an image) and the label, and so under-detects plumes in low-coverage images. Method: We evaluate multiple imputation approaches and propose a weighted resampling scheme that enforces class balance within each coverage bin, removing the dependence between coverage and label. Contribution/Results: Both resampling and imputation significantly reduce representation bias without hurting balanced accuracy, precision, or recall. On real-world low-coverage scenes, the debiased models improve plume detection rate by 23.6%. To our knowledge, this is the first work to combine coverage-aware resampling with imputation for methane plume detection in remote sensing, improving model robustness under cloud cover.

📝 Abstract

Most satellite images have systematically missing pixels (i.e., data missing not at random (MNAR)) due to factors such as clouds. If not addressed, these missing pixels can lead to representation bias in automated feature extraction models. In this work, we show that a spurious association between the label and the number of missing values in methane plume detection can cause the model to associate the coverage (i.e., the percentage of valid pixels in an image) with the label, subsequently under-detecting plumes in low-coverage images. We evaluate multiple imputation approaches to remove the dependence between the coverage and the label. Additionally, we propose a weighted resampling scheme during training that removes the association between the label and the coverage by enforcing class balance in each coverage bin. Our results show that both resampling and imputation can significantly reduce the representation bias without hurting balanced accuracy, precision, or recall. Finally, we evaluate the capability of the debiased models using these techniques in an operational scenario and demonstrate that the debiased models have a higher chance of detecting plumes in low-coverage images.
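The resampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the equal-width binning, the number of bins, the NaN encoding of missing pixels, and the function names `coverage`, `coverage_bin`, and `balanced_weights` are all assumptions made for illustration. Each sample gets a weight inversely proportional to the number of samples sharing its (coverage bin, label) cell, so that within every bin the two classes carry equal total sampling weight.

```python
import math
import random
from collections import Counter


def coverage(image):
    """Fraction of valid (non-NaN) pixels in a non-empty 2D image (list of rows)."""
    pixels = [p for row in image for p in row]
    valid = sum(1 for p in pixels if not math.isnan(p))
    return valid / len(pixels)


def coverage_bin(cov, n_bins=4):
    """Map a coverage in [0, 1] to an equal-width bin index in 0..n_bins-1."""
    return min(int(cov * n_bins), n_bins - 1)


def balanced_weights(coverages, labels, n_bins=4):
    """Sampling weights giving each class equal total weight inside every bin.

    Assumption: weight = 1 / (number of samples in the same (bin, label) cell).
    A bin that contains only one class cannot be balanced by reweighting alone.
    """
    bins = [coverage_bin(c, n_bins) for c in coverages]
    cell_counts = Counter(zip(bins, labels))
    return [1.0 / cell_counts[(b, y)] for b, y in zip(bins, labels)]


if __name__ == "__main__":
    covs = [0.1, 0.15, 0.2, 0.9, 0.95, 0.8]
    labels = [0, 0, 1, 1, 1, 0]  # 1 = plume, 0 = no plume
    weights = balanced_weights(covs, labels)
    # Draw one resampled epoch of training indices with replacement.
    rng = random.Random(0)
    epoch = rng.choices(range(len(labels)), weights=weights, k=len(labels))
    print(weights, epoch)
```

Note that this particular weighting also gives every occupied bin the same total weight, which changes the coverage distribution seen during training. Whether the paper's scheme does the same is not stated in the abstract.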

Problem

Research questions and friction points this paper is trying to address.

Addressing representation bias from missing satellite pixels

Mitigating spurious label-coverage associations in methane detection

Improving plume detection accuracy in low-coverage images
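One simple way to check whether a dataset carries the spurious label-coverage association described above is to compare the plume rate across coverage bins. The sketch below is only a diagnostic illustration under assumed conventions (equal-width bins, binary labels with 1 meaning plume, a hypothetical function name `positive_rate_by_bin`); the paper may quantify the association differently.

```python
from collections import defaultdict


def positive_rate_by_bin(coverages, labels, n_bins=4):
    """Return {bin index: fraction of positive labels} for each occupied bin.

    coverages: per-image coverage values in [0, 1]
    labels:    per-image binary labels (1 = plume, 0 = no plume)
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cov, y in zip(coverages, labels):
        b = min(int(cov * n_bins), n_bins - 1)  # equal-width binning (assumption)
        totals[b] += 1
        positives[b] += y
    return {b: positives[b] / totals[b] for b in sorted(totals)}


if __name__ == "__main__":
    rates = positive_rate_by_bin([0.1, 0.2, 0.9, 0.95], [0, 0, 1, 1])
    print(rates)
```

If the rates differ strongly between low-coverage and high-coverage bins, a model can exploit coverage as a shortcut for the label, which is the bias the paper sets out to remove.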

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple imputation approaches remove coverage-label dependence

Weighted resampling enforces class balance per coverage bin

Debiased models improve plume detection in low-coverage images
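The summary does not say which imputation methods were evaluated, so the following is only a generic illustration of the idea: filling missing pixels so that the input no longer directly exposes how many pixels were missing. It uses per-image mean filling, assumes missing pixels are encoded as NaN, and the function name `mean_impute` is hypothetical. It should not be read as one of the paper's approaches.

```python
import math


def mean_impute(image):
    """Replace NaN pixels in a 2D image (list of rows) with the mean of valid pixels."""
    valid = [p for row in image for p in row if not math.isnan(p)]
    if not valid:
        raise ValueError("image has no valid pixels to impute from")
    fill = sum(valid) / len(valid)
    return [[fill if math.isnan(p) else p for p in row] for row in image]


if __name__ == "__main__":
    nan = float("nan")
    print(mean_impute([[1.0, nan], [3.0, nan]]))
```

A constant fill like this is the simplest baseline; it can still leave large flat regions that a model might learn to detect, which is one reason more careful imputation or resampling may be needed.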

🔎 Similar Papers

Scaling Efficient Masked Image Modeling on Large Remote Sensing Dataset