🤖 AI Summary
Problem: Cloud cover and similar factors leave satellite imagery with systematically missing pixels, a Missing-Not-At-Random (MNAR) mechanism. This missingness introduces representation bias in methane plume detection models: they learn a spurious association between coverage (the percentage of valid pixels in an image) and the label, and consequently under-detect plumes in low-coverage images.
Method: We propose a weighted resampling scheme that enforces class balance within each coverage bin during training, removing the association between coverage and label, alongside several imputation approaches evaluated for the same purpose.
Contribution/Results: Both resampling and imputation significantly reduce representation bias without hurting balanced accuracy, precision, or recall. On real-world low-coverage scenes, the debiased models improve plume detection rate by 23.6%. To our knowledge, this is the first work to jointly leverage coverage-aware resampling and MNAR-adapted imputation for methane remote sensing detection, thereby enhancing model robustness and generalization under complex cloud conditions.
📝 Abstract
Most satellite images have systematically missing pixels (i.e., data missing not at random (MNAR)) due to factors such as clouds. If not addressed, these missing pixels can lead to representation bias in automated feature extraction models. In this work, we show that a spurious association between the label and the number of missing values in methane plume detection can cause the model to associate the coverage (i.e., the percentage of valid pixels in an image) with the label, subsequently under-detecting plumes in low-coverage images. We evaluate multiple imputation approaches to remove the dependence between the coverage and the label. Additionally, we propose a weighted resampling scheme during training that removes the association between the label and the coverage by enforcing class balance in each coverage bin. Our results show that both resampling and imputation can significantly reduce the representation bias without hurting balanced accuracy, precision, or recall. Finally, we evaluate the debiased models in an operational scenario and demonstrate that they have a higher chance of detecting plumes in low-coverage images.
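The idea of enforcing class balance within each coverage bin can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`coverage`, `balanced_bin_weights`, `resample_indices`), the equal-width binning, and the default of five bins are all assumptions made for illustration. Each sample receives a weight such that, inside every coverage bin, the classes carry equal total probability mass while the bin keeps its original share of the data.

```python
import numpy as np


def coverage(valid_mask):
    """Fraction of valid pixels; valid_mask is True where a pixel is observed."""
    return float(np.mean(valid_mask))


def balanced_bin_weights(coverages, labels, n_bins=5):
    """Sampling probabilities that balance classes inside each coverage bin.

    Assumption: equal-width bins on [0, 1]. A bin that contains only one
    class cannot be balanced and keeps uniform weights.
    """
    coverages = np.asarray(coverages, dtype=float)
    labels = np.asarray(labels)
    # Interior edges of n_bins equal-width bins on [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    bins = np.digitize(coverages, edges)

    weights = np.zeros(len(labels), dtype=float)
    for b in np.unique(bins):
        in_bin = bins == b
        classes = np.unique(labels[in_bin])
        for c in classes:
            cell = in_bin & (labels == c)
            # Split the bin's total mass equally among the classes present,
            # then equally among the samples of each class.
            weights[cell] = in_bin.sum() / (len(classes) * cell.sum())
    return weights / weights.sum()


def resample_indices(weights, n_samples, seed=0):
    """Draw training indices with replacement according to the weights."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(weights), size=n_samples, replace=True, p=weights)


# Toy data: plumes (label 1) are rare at low coverage and common at high coverage.
toy_coverages = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
toy_labels = [0, 0, 0, 1, 1, 1, 1, 0]
toy_weights = balanced_bin_weights(toy_coverages, toy_labels, n_bins=2)
toy_indices = resample_indices(toy_weights, n_samples=1000)
```

In the toy data, the single plume image in the low-coverage bin is drawn as often as the three non-plume images combined, so coverage alone no longer predicts the label in the resampled training set. In a deep learning pipeline the same weights could be passed to a weighted sampler instead of drawing indices up front.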