🤖 AI Summary
This study addresses the challenge of predicting mycotoxin contamination in Irish oats to enable pre-harvest interventions that mitigate economic losses.
Method: We developed a multi-output machine learning framework for simultaneous prediction of multiple mycotoxins, pioneering the application of state-of-the-art transfer learning models, including TabPFN, to agricultural mycotoxin forecasting. We systematically benchmarked five modelling approaches (a baseline MLP, an MLP with pre-training, TabPFN, TabNet, and FT-Transformer) and used permutation feature importance to identify critical predictors.
Contribution/Results: TabPFN achieved the best overall performance, reducing average RMSE by 12% and improving AUC by 0.08 over baselines. Weather history patterns in the 90-day pre-harvest window, together with seed moisture content, emerged as the most discriminative features. This work empirically validates transfer learning for agricultural mycotoxin prediction and highlights the role of climate-sensitive pre-harvest windows, offering a methodological framework that may generalize to regional mycotoxin risk forecasting.
📝 Abstract
Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. Accurate prediction of mycotoxin levels can support early intervention strategies and reduce economic losses. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops, framed as a multi-response prediction task. Our dataset comprises oat samples collected in Ireland, with a mix of environmental, agronomic, and geographical predictors. Five modelling approaches were evaluated: a baseline multilayer perceptron (MLP), an MLP with pre-training, and three transfer learning models (TabPFN, TabNet, and FT-Transformer). Model performance was evaluated using regression (RMSE, $R^2$) and classification (AUC, F1) metrics, with results reported per toxin and on average. Additionally, a permutation-based variable importance analysis was conducted to identify the most influential predictors across both prediction tasks. The transfer learning approach TabPFN provided the best overall performance, followed by the baseline MLP. The variable importance analysis revealed that weather history patterns in the 90-day pre-harvest period were the most important predictors, alongside seed moisture content.
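To make the permutation-based variable importance idea concrete, the following is a minimal sketch for a multi-response regression setting. It is not the authors' implementation: the synthetic data, the least-squares model (a stand-in for TabPFN or an MLP), and the helper names `rmse_per_target` and `permutation_importance` are all assumptions introduced for illustration. The importance of a predictor is taken here as the increase in RMSE, averaged over responses, when that predictor's column is shuffled.

```python
import numpy as np


def rmse_per_target(y_true, y_pred):
    """RMSE computed separately for each response column (e.g. each toxin)."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


def permutation_importance(predict, X, Y, n_repeats=10, seed=0):
    """Mean increase in average RMSE when each predictor column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = rmse_per_target(Y, predict(X)).mean()
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its association with the responses.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse_per_target(Y, predict(X_perm)).mean() - baseline)
        importances[j] = np.mean(increases)
    return importances


# Synthetic stand-in data (assumption): 3 predictors, 2 responses.
# Predictor 0 is strongly informative, predictor 1 is pure noise,
# predictor 2 is weakly informative.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
W_true = np.array([[2.0, 1.5], [0.0, 0.0], [0.3, -0.2]])
Y = X @ W_true + 0.1 * rng.normal(size=(200, 2))

# Hypothetical stand-in model: multi-output ordinary least squares.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)


def predict(X_new):
    return X_new @ W_hat


importances = permutation_importance(predict, X, Y)
print(importances)
```

For brevity the sketch scores importance on the same data used for fitting; in practice it would normally be computed on held-out samples. For the classification task, the same loop would apply with the RMSE replaced by a metric such as AUC and the sign of the difference reversed.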