A Linear-Transformer Hybrid for SNP-Based Genotype-to-Phenotype Prediction in Grapevine

📅 2026-05-07

🤖 AI Summary
Predicting complex phenotypes such as grapevine leaf hair density and trichome density from SNP data remains challenging under variable field conditions and across years because existing models lack robustness. This work proposes LiT-G2P, a framework that combines a linear component, which captures additive genetic effects, with a Transformer that models nonlinear SNP–SNP interactions. Using genome-wide SNP data, attention mechanisms, and genotype-stratified analysis, LiT-G2P achieves single-year and cross-year root mean square errors (RMSE) of 0.469 and 0.454 for hair density, with tolerance accuracies of 79.2% and 74.6%, respectively, the lowest errors among the compared baselines. The model's attention weights are also used to prioritize candidate SNP markers for downstream validation, which adds interpretability for perennial crop breeding.
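The combination described in the summary above, an additive linear term plus an attention-based interaction term, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 0/1/2 dosage coding, the per-SNP embedding, the single attention head, the mean pooling, and the choice to sum the two branches are all assumptions made for illustration, and the names `init_params`, `hybrid_predict`, and `snp_importance` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(n_snps, dim, seed=0):
    """Random (untrained) parameters for the illustrative hybrid model."""
    rng = np.random.default_rng(seed)
    return {
        "beta": rng.normal(0.0, 0.1, n_snps),         # additive SNP effects
        "embed": rng.normal(0.0, 0.1, (n_snps, dim)),  # one embedding per SNP
        "Wq": rng.normal(0.0, 0.1, (dim, dim)),
        "Wk": rng.normal(0.0, 0.1, (dim, dim)),
        "Wv": rng.normal(0.0, 0.1, (dim, dim)),
        "w_out": rng.normal(0.0, 0.1, dim),            # interaction read-out
        "bias": 0.0,
    }


def hybrid_predict(genotypes, params):
    """Return (predictions, attention) for a (n_samples, n_snps) dosage matrix.

    Assumption: genotypes are coded as allele dosages 0/1/2.
    """
    g = np.asarray(genotypes, dtype=float)

    # Linear branch: sum of per-SNP additive effects.
    additive = g @ params["beta"]                       # (n,)

    # Interaction branch: one token per SNP, scaled by its dosage.
    tokens = g[:, :, None] * params["embed"][None]      # (n, p, d)
    q = tokens @ params["Wq"]
    k = tokens @ params["Wk"]
    v = tokens @ params["Wv"]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (n, p, p)
    attn = softmax(scores, axis=-1)                     # rows sum to 1
    pooled = (attn @ v).mean(axis=1)                    # (n, d)
    interaction = pooled @ params["w_out"]              # (n,)

    return additive + interaction + params["bias"], attn


def snp_importance(attn):
    """Average attention each SNP receives, over samples and query positions."""
    return attn.mean(axis=(0, 1))                       # (p,)
```

With trained parameters, ranking the output of `snp_importance` would give one simple way to shortlist SNPs for follow-up analysis; how the paper actually derives its model-prioritized SNPs from attention weights is not specified here.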
📝 Abstract
Robust genotype-to-phenotype (G2P) prediction is essential for accelerating breeding decisions and genetic gain. However, it remains challenging to measure complex traits under variable field conditions and across years. In this study, we propose a linear-Transformer approach, LiT-G2P (Linear-Transformer Genotype-to-Phenotype), an automated predictive framework that integrates additive genetic effects with Transformer-based nonlinear interactions using genome-wide single-nucleotide polymorphism (SNP) data. We evaluated LiT-G2P on a panel of diverse grape accessions, genotyped with SNP markers and phenotyped across two consecutive years. The target traits are leaf hair density and trichome density of grapevines. Across both single-year and cross-year testing scenarios, LiT-G2P consistently improves prediction performance compared with baseline models. For hair density, LiT-G2P achieves the lowest error in both single-year and cross-year evaluations, with RMSEs of 0.469 and 0.454 and tolerance accuracies of 79.2% and 74.6%, respectively. For trichome density, LiT-G2P also delivers the best overall G2P performance. In addition, we extract model-prioritized SNPs from attention weights and apply genotype-stratified analysis to provide interpretable candidate markers for downstream validation. These results demonstrate that integrating stable additive effects with learned interaction patterns can enhance cross-year robustness and support practical SNP-based predictive modeling for genomic selection.
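The two metrics reported in the abstract can be sketched as follows. RMSE is standard; the abstract does not define "tolerance accuracy", so the version here, the fraction of predictions whose absolute error is within a fixed tolerance, is an assumption, and the default tolerance of 0.5 is a placeholder rather than a value from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def tolerance_accuracy(y_true, y_pred, tolerance=0.5):
    """Fraction of predictions with absolute error <= tolerance.

    Assumption: this is one plausible reading of "tolerance accuracy";
    the tolerance value is hypothetical.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)
```

In a cross-year setting, both functions would be applied to predictions for one year's phenotypes from a model fitted on the other year, so that the reported numbers reflect robustness to year-to-year variation rather than in-sample fit.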
Problem

Research questions and friction points this paper is trying to address.

genotype-to-phenotype prediction
SNP
grapevine
complex traits
cross-year robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear-Transformer hybrid
Genotype-to-Phenotype prediction
SNP-based modeling
Cross-year robustness
Attention interpretability
Yibin Wang
Intern at UIUC
Trustworthy AI
Murukarthick Jayakodi
Department of Soil and Crop Sciences, Texas A&M AgriLife Research, Texas A&M University System, Dallas, TX, 75252, USA
Silvas Kirubakaran
USDA-ARS, Grape Genetics Research Unit, 630 West North Street, Geneva, New York 14456, USA
Ambika Chandra
Department of Soil and Crop Sciences, Texas A&M AgriLife Research, Texas A&M University System, Dallas, TX, 75252, USA
Azlan Zahid
Department of Biological and Agricultural Engineering, Texas A&M AgriLife Research, Texas A&M University System, Dallas, TX, 75252, USA