Gaussian Process Modeling with Genotype x Environment Kernels for Wheat Performance Prediction

📅 2025-08-25
🤖 AI Summary
Wheat cultivar performance varies across environments, which threatens reliable food production and growers' incomes. This study proposes a Gaussian process (GP) model for genotype-by-environment (G×E) prediction of wheat yield and grain protein content. Alongside the Euclidean-based kernels common in linear mixed-effect models for plant breeding, it tests kernels designed for strings and time series, so that genetic and environmental information enter the model through the covariance structure. The resulting models predict competitively for both new varieties and new environmental conditions, even with little to no previous data for the new variety or conditions. The evaluation uses a novel wheat dataset collected in Switzerland, and the approach can be extended to other agricultural applications.

📝 Abstract
Optimizing wheat variety selection for high performance in different environmental conditions is critical for reliable food production and stable incomes for growers. We employ a statistical machine learning framework utilizing Gaussian Process (GP) models to capture the effects of genetic and environmental factors on wheat yield and protein content. Doing so requires selecting covariance kernels suited to the distinct characteristics of the data. The GP approach is closely related to linear mixed-effect models for genotype x environment predictions, where random additive and interaction effects are modeled with covariance structures. However, while the linear mixed-effect models commonly used in plant breeding rely on Euclidean-based kernels, we also test kernels specifically designed for strings and time series. The resulting GP models are capable of competitively predicting outcomes for (1) new environmental conditions, and (2) new varieties, even in scenarios with little to no previous data for the new conditions or variety. While we focus on a wheat test case using a novel dataset collected in Switzerland, the GP approach presented here can be applied and extended to a wide range of agricultural applications and beyond, paving the way for improved decision-making and data acquisition strategies.
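
The link between the GP formulation and linear mixed-effect genotype x environment models can be made concrete with a standard construction. The following is a sketch under assumed notation: the symbols $k_G$, $k_E$, and the variance parameters are not taken from the paper, and the authors' exact model may differ.

```latex
\begin{align}
y(g, e) &= \mu + f(g, e) + \varepsilon,
  \qquad \varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2),
  \qquad f \sim \mathcal{GP}(0, k), \\
k\big((g, e), (g', e')\big)
  &= \sigma_G^2 \, k_G(g, g')
   + \sigma_E^2 \, k_E(e, e')
   + \sigma_{GE}^2 \, k_G(g, g') \, k_E(e, e').
\end{align}
```

Under this assumption, the three kernel terms play the roles of the random genotype effect, the random environment effect, and their interaction. Any positive semi-definite choice of $k_G$ and $k_E$ keeps $k$ valid, which is what would allow Euclidean-based kernels to be swapped for string or time-series kernels.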
Problem

Research questions and friction points this paper is trying to address.

Predicting wheat yield and protein content using genetic and environmental factors
Selecting optimal covariance kernels for genotype-environment interaction modeling
Enabling accurate predictions for new varieties and environmental conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Process models with specialized kernels
Kernels for strings and time series data
Competitive predictions for new environments and new varieties
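
The idea of combining genotype and environment kernels in one GP can be sketched in a few lines of NumPy. This is a minimal illustration under loud assumptions, not the authors' implementation: it uses squared-exponential kernels on synthetic vector descriptors in place of the string and time-series kernels the paper tests, and the names `rbf_kernel`, `gxe_kernel`, and `gp_predict` as well as all data are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gxe_kernel(G1, E1, G2, E2):
    """Assumed additive-plus-interaction covariance: k_G + k_E + k_G * k_E."""
    kg = rbf_kernel(G1, G2)  # similarity between genotypes
    ke = rbf_kernel(E1, E2)  # similarity between environments
    return kg + ke + kg * ke

def gp_predict(G_tr, E_tr, y_tr, G_te, E_te, noise_var=0.1):
    """Posterior mean and latent variance of a zero-mean GP with Gaussian noise."""
    K = gxe_kernel(G_tr, E_tr, G_tr, E_tr) + noise_var * np.eye(len(y_tr))
    K_s = gxe_kernel(G_te, E_te, G_tr, E_tr)
    K_ss = gxe_kernel(G_te, E_te, G_te, E_te)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_tr))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Synthetic data (hypothetical): 6 genotypes observed in 5 environments.
rng = np.random.default_rng(0)
n_gen, n_env = 6, 5
G = rng.normal(size=(n_gen, 3))  # made-up genotype descriptors
E = rng.normal(size=(n_env, 2))  # made-up environment covariates
gi, ei = np.meshgrid(np.arange(n_gen), np.arange(n_env), indexing="ij")
gi, ei = gi.ravel(), ei.ravel()
y = G[gi, 0] + E[ei, 0] + 0.5 * G[gi, 1] * E[ei, 1] + 0.1 * rng.normal(size=gi.size)

# Hold out the last environment entirely to mimic a "new environment" scenario.
train = ei != n_env - 1
mean, var = gp_predict(G[gi[train]], E[ei[train]], y[train],
                       G[gi[~train]], E[ei[~train]])
```

Holding out a genotype index instead of an environment index would mimic the "new variety" scenario in the same way. In both cases the prediction relies only on kernel similarity between the unseen item and the observed ones.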
Lea Friedli
Engineering Risk Analysis Group, Technical University of Munich, Germany
Tim Steinert
Institute of Mathematical Statistics and Actuarial Science, University of Bern, Switzerland
Nathalie Wuyts
Agroscope, Plant-Production Systems, Switzerland
Fabian Guignard
METAS, Federal Institute of Metrology, Switzerland
Lilia Levy Häner
Agroscope, Plant-Production Systems, Switzerland
Didier Pellet
Agroscope, Plant-Production Systems, Switzerland
Juan M. Herrera
Agroscope, Plant-Production Systems, Switzerland
David Ginsbourger
University of Bern
Gaussian Processes, Bayesian optimization, Uncertainty Quantification, Inversion, Kernels