🤖 AI Summary
Problem: Traditional process-based models generalize poorly when predicting corn yield under drought stress, while machine learning models offer limited interpretability and tend to overestimate yield in drought conditions.
Method: We propose KGML-SM, a knowledge-guided machine learning framework that explicitly models soil moisture as a physics-informed intermediate variable, integrating multisource remote sensing data with agronomic priors. It incorporates a drought-aware loss function that penalizes prediction errors in water-stressed regions. Feature importance analysis, coupled with spatiotemporal attribution, quantifies how soil moisture’s effect on yield varies by region and over time.
Results: Experiments demonstrate that KGML-SM reduces prediction error by 18.7% in drought-affected areas, significantly outperforming both conventional machine learning and process-based models. The framework achieves a balanced trade-off between physical interpretability—rooted in soil–plant–atmosphere continuum principles—and predictive robustness under climate variability.
📝 Abstract
Remote sensing (RS) techniques enable non-contact acquisition of extensive ground observations and have become a valuable tool for corn yield prediction. Traditional process-based (PB) models are limited by fixed input features and struggle to incorporate large volumes of RS data. In contrast, machine learning (ML) models are often criticized as "black boxes" with limited interpretability. To address these limitations, we used Knowledge-Guided Machine Learning (KGML), which combines the strengths of both approaches and makes full use of RS data. However, previous KGML methods overlooked the crucial role of soil moisture in plant growth. To bridge this gap, we proposed the Knowledge-Guided Machine Learning with Soil Moisture (KGML-SM) framework, which uses soil moisture as an intermediate variable to emphasize its key role in plant development. Additionally, based on the prior knowledge that the model may overestimate yield under drought conditions, we designed a drought-aware loss function that penalizes predicted yield in drought-affected areas. In our experiments, the KGML-SM model outperformed other ML models. Finally, we explored the relationships between drought, soil moisture, and corn yield prediction, assessing the importance of various features and analyzing how soil moisture affects corn yield predictions across different regions and time periods.
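To make the idea of a drought-aware loss concrete, here is a minimal sketch. It is not the authors' formulation, which the abstract does not specify. It assumes a mean-squared-error base term plus an extra penalty applied only when a drought-flagged sample is over-predicted. The function name `drought_aware_loss`, the `is_drought` flags, and the `penalty_weight` value are all hypothetical, and plain Python is used so the example runs without dependencies.

```python
def drought_aware_loss(y_pred, y_true, is_drought, penalty_weight=0.5):
    """Illustrative loss: MSE plus a penalty on over-prediction under drought.

    ASSUMPTION: the penalty form and penalty_weight are invented for this
    sketch; they are not taken from the KGML-SM paper.
    """
    n = len(y_true)
    if n == 0 or not (len(y_pred) == len(is_drought) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    # Base term: ordinary mean squared error over all samples.
    mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n

    # Extra term: squared over-prediction, counted only for drought samples.
    # Under-prediction and non-drought samples contribute nothing here.
    penalty = sum(
        max(p - t, 0.0) ** 2
        for p, t, d in zip(y_pred, y_true, is_drought)
        if d
    ) / n

    return mse + penalty_weight * penalty


# Toy data: two drought samples (one over-, one under-predicted)
# and two non-drought samples with the same errors.
y_true = [10.0, 10.0, 10.0, 10.0]
y_pred = [12.0, 8.0, 12.0, 8.0]
is_drought = [True, True, False, False]

loss = drought_aware_loss(y_pred, y_true, is_drought)
baseline = drought_aware_loss(y_pred, y_true, [False] * 4)
print(loss, baseline)  # 4.5 4.0
```

In the toy data only the first sample is both drought-flagged and over-predicted, so it alone raises the loss above the plain MSE of 4.0. An asymmetric term of this kind pushes a model toward lower predictions where drought is flagged, which is one plausible way to counter the overestimation the abstract describes.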