Enhancing Strawberry Yield Forecasting with Backcasted IoT Sensor Data and Machine Learning

📅 2025-04-25
🤖 AI Summary
To address the scarcity of cross-seasonal IoT sensor data in strawberry cultivation, which severely limits the training of yield prediction models, this paper proposes a backcasting paradigm tailored to small-sample agricultural scenarios. We integrate two years of multimodal greenhouse environmental measurements (temperature, humidity, soil moisture, photosynthetically active radiation, and irrigation volume) with four years of manual yield records and external weather station data to construct a temporal joint generative model that reconstructs the missing sensor data for periods when sensors were not deployed. The synthetically generated data significantly improve the performance of XGBoost and LSTM regression models: mean absolute error (MAE) decreases by 23.6% compared to baseline models relying solely on historical yield and meteorological data. This work represents the first systematic application of backcasting to mitigate IoT data insufficiency in agriculture, providing a reproducible technical pathway and methodological foundation for small-sample smart farming.
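
For reference, the error figure quoted above can be read as follows, assuming the standard definition of MAE and assuming the reported decrease is relative to the baseline's MAE (the summary does not state this explicitly). The symbols are introduced here for illustration only: $y_i$ is a recorded yield, $\hat{y}_i$ the corresponding forecast, and $n$ the number of evaluation points.

```latex
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert,
\qquad
\text{relative reduction} =
\frac{\mathrm{MAE}_{\text{baseline}} - \mathrm{MAE}_{\text{synthetic}}}{\mathrm{MAE}_{\text{baseline}}}
```

Under this reading, a 23.6% decrease would mean that the model trained with synthetic data has roughly 0.764 times the MAE of the baseline.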

📝 Abstract
Given rapid global population growth, digitally enabled agriculture is crucial for sustainable food production and for helping farmers and other stakeholders make informed decisions about resource management. The deployment of Internet of Things (IoT) technologies that collect real-time observations of environmental factors (e.g., temperature and humidity) and operational factors (e.g., irrigation) influencing production is often seen as a critical step towards enabling novel downstream tasks, such as AI-based yield forecasting. However, because AI models require large amounts of data, this creates practical challenges in a real-world, dynamic farm setting, where IoT observations would need to be collected over a number of seasons. In this study, we deployed IoT sensors in strawberry production polytunnels for two growing seasons to collect environmental data, including water usage, external and internal temperature, external and internal humidity, soil moisture, soil temperature, and photosynthetically active radiation. The sensor observations were combined with manually provided yield records spanning four seasons. To fill the gap left by the two seasons without IoT observations, we propose an AI-based backcasting approach that generates synthetic sensor observations from historical data recorded at a nearby weather station and from the existing polytunnel observations. To evaluate the approach, we built an AI-based yield forecasting model using the combination of real and synthetic observations. Our results demonstrate that incorporating synthetic data improves yield forecasting accuracy: models that include synthetic data outperform those trained only on historical yield, weather records, and real sensor data.
Problem

Research questions and friction points this paper is trying to address.

Enhancing strawberry yield forecasting using IoT and machine learning
Bridging data gaps with AI-based backcasting of sensor observations
Improving forecasting accuracy by combining real and synthetic data
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-based backcasting for synthetic sensor data
IoT sensors collect environmental and operational data
Machine learning model improves yield forecasting accuracy
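
The backcasting idea listed above can be illustrated with a minimal sketch. It is not the authors' model: it swaps in a plain least-squares linear map from weather-station features to polytunnel sensor readings, fitted on the seasons where both exist and then applied to the earlier seasons where only weather data is available. All data, array shapes, and function names (`fit_backcaster`, `backcast`) are hypothetical and randomly generated for illustration.

```python
import numpy as np


def fit_backcaster(weather, sensors):
    """Fit a least-squares linear map (with intercept) from weather to sensors."""
    X = np.column_stack([weather, np.ones(len(weather))])  # append intercept column
    coef, *_ = np.linalg.lstsq(X, sensors, rcond=None)
    return coef


def backcast(coef, weather):
    """Generate synthetic sensor readings for periods with weather data only."""
    X = np.column_stack([weather, np.ones(len(weather))])
    return X @ coef


# Hypothetical toy data: 400 days of weather-station readings with
# two features (for example external temperature and humidity).
rng = np.random.default_rng(0)
weather_all = rng.normal(size=(400, 2))

# Assumed ground-truth relation, used only to simulate the sensor readings.
true_map = np.array([[0.8, 0.1], [0.2, 0.9]])
sensors_all = weather_all @ true_map + 3.0 + rng.normal(scale=0.05, size=(400, 2))

# Sensors were deployed only in the later half; the earlier half is backcast.
missing = slice(0, 200)
observed = slice(200, 400)

coef = fit_backcaster(weather_all[observed], sensors_all[observed])
synthetic = backcast(coef, weather_all[missing])

# In this toy setup the "missing" readings are known, so the error can be checked.
mae = np.mean(np.abs(synthetic - sensors_all[missing]))
print(f"Backcast MAE on held-out early period: {mae:.3f}")
```

In practice the early-period sensor readings are unavailable, so a check like the last step would have to be run on a held-out part of the observed seasons instead. A linear map is used here only to keep the sketch short; the source describes an AI-based approach without giving its details in this summary.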
Authors

T. Ayall
Interdisciplinary Institute and Department of Computing Science, University of Aberdeen, UK
Andy Li
Monash University
MAPF
Matthew Beddows
Interdisciplinary Institute and Department of Computing Science, University of Aberdeen, UK
Milan Markovic
Interdisciplinary Fellow in Data & AI - University of Aberdeen, UK
Accountability, Compliance, Transparency, Provenance, Semantic Web
G. Leontidis
Interdisciplinary Institute and Department of Computing Science, University of Aberdeen, UK