AgriVariant: Variant Effect Prediction using DeepChem-Variant for Precision Breeding in Rice

📅 2026-02-19
🤖 AI Summary
Precision breeding needs efficient computational tools for predicting the functional effects of genetic variants in crops. This study presents an end-to-end pipeline for variant-effect prediction in rice that integrates the DeepChem-Variant deep learning model with crop-specific gene annotations from RAP-DB and scores deleteriousness from the Grantham distance and the BLOSUM62 substitution matrix. The scoring method does not rely on external variant databases, and the pipeline can be extended to other crop species that have a reference genome and gene annotations. Applied to the OsMT-3a gene, the pipeline predicted the functional impact of all 1,509 possible single-nucleotide variants within ten days and sorted them into high-, medium-, and low-impact categories, an analysis the abstract estimates would take 2 to 4 years with conventional wet-lab experiments (a speedup of roughly two orders of magnitude).
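To make the idea of database-independent scoring concrete, here is a minimal sketch of how a Grantham distance and a BLOSUM62 score might be blended into a single value between 0 and 1. The blending formula, the equal weighting, the normalization bounds, and the function name `deleteriousness` are all assumptions for illustration; the source does not say how AgriVariant combines the two measures. Only four amino-acid pairs are included, as a small excerpt of the published matrices.

```python
# Hypothetical sketch: the blend and the weights are NOT taken from the paper.

# Small excerpts of the published matrices; pairs are keyed in sorted order.
GRANTHAM = {("I", "L"): 5, ("K", "R"): 26, ("D", "E"): 45, ("C", "W"): 215}
BLOSUM62 = {("I", "L"): 2, ("K", "R"): 2, ("D", "E"): 2, ("C", "W"): -2}

GRANTHAM_MAX = 215               # largest Grantham distance (Cys <-> Trp)
BLOSUM_MIN, BLOSUM_MAX = -4, 3   # assumed bounds for off-diagonal BLOSUM62 scores


def deleteriousness(ref_aa: str, alt_aa: str, weight: float = 0.5) -> float:
    """Blend both measures into [0, 1]; higher means more disruptive."""
    if ref_aa == alt_aa:
        return 0.0  # no amino-acid change
    key = tuple(sorted((ref_aa, alt_aa)))
    # Grantham: a larger physicochemical distance is more disruptive.
    grantham_part = GRANTHAM[key] / GRANTHAM_MAX
    # BLOSUM62: a lower substitution score is more disruptive, so invert it.
    blosum_part = (BLOSUM_MAX - BLOSUM62[key]) / (BLOSUM_MAX - BLOSUM_MIN)
    return weight * grantham_part + (1 - weight) * blosum_part


if __name__ == "__main__":
    for ref, alt in [("L", "I"), ("K", "R"), ("D", "E"), ("C", "W")]:
        print(f"{ref}->{alt}: {deleteriousness(ref, alt):.3f}")
```

Under these assumptions a conservative change such as Leu to Ile scores near 0 and a radical change such as Cys to Trp scores near 1. A full implementation would load the complete 20 x 20 matrices instead of the excerpt used here.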

๐Ÿ“ Abstract
Predicting the functional consequences of genetic variants in crop genes remains a critical bottleneck for precision breeding programs. We present AgriVariant, an end-to-end pipeline for variant-effect prediction in rice (Oryza sativa) that addresses the lack of crop-specific variant-interpretation tools and can be extended to any crop species with an available reference genome and gene annotations. Our approach integrates deep-learning-based variant calling (DeepChem-Variant) with custom plant genomics annotation using RAP-DB gene models and database-independent deleteriousness scoring that combines the Grantham distance and the BLOSUM62 substitution matrix. We validate the pipeline through targeted mutations in stress-response genes (OsDREB2a, OsDREB1F, SKC1), demonstrating correct classification of stop-gained, missense, and synonymous variants with appropriate HIGH / MODERATE / LOW impact assignments. An exhaustive mutagenesis study of OsMT-3a analyzed all 1,509 possible single-nucleotide variants in 10 days, identifying 353 high-impact, 447 medium-impact, and 709 low-impact variants, an analysis that would have required 2 to 4 years using traditional wet-lab approaches. This computational framework enables breeders to prioritize variants for experimental validation across diverse crop species, reducing screening costs and accelerating the development of climate-resilient crop varieties.
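The exhaustive mutagenesis and the HIGH / MODERATE / LOW assignments described in the abstract can be illustrated with a small sketch. It enumerates every single-nucleotide variant of a toy coding sequence (three alternative bases per position, so 3 x length variants) and labels each by its effect on the codon. The count of 1,509 equals 3 x 503, which would be consistent with a 503-nucleotide target, although the source does not state the sequence length. The function names, the toy sequence, and the `stop_lost` branch are assumptions for illustration; this is not the AgriVariant implementation, and it handles only an in-frame coding sequence.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(ref_codon: str, alt_codon: str) -> tuple:
    """Return (consequence, impact) for a codon change."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return ("synonymous", "LOW")
    if alt_aa == "*":
        return ("stop_gained", "HIGH")
    if ref_aa == "*":
        # Assumption: not listed in the source; follows common annotation practice.
        return ("stop_lost", "HIGH")
    return ("missense", "MODERATE")


def all_snvs(cds: str):
    """Yield (position, ref, alt, consequence, impact) for every SNV in a CDS."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for pos, ref in enumerate(cds):
        offset = pos % 3                      # position of the base inside its codon
        ref_codon = cds[pos - offset: pos - offset + 3]
        for alt in "ACGT":
            if alt == ref:
                continue
            alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
            yield (pos + 1, ref, alt) + classify(ref_codon, alt_codon)


if __name__ == "__main__":
    toy_cds = "ATGTGGTAA"  # hypothetical sequence: Met, Trp, stop
    variants = list(all_snvs(toy_cds))
    print(len(variants), "variants")
    for row in variants[:5]:
        print(row)
```

Sorting variants into impact tiers in this way gives breeders a ranked shortlist for experimental validation, with a graded score such as the Grantham and BLOSUM62 combination separating missense variants further.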
Problem

Research questions and friction points this paper is trying to address.

Variant effect prediction
Precision breeding
Rice
Genetic variants
Crop genomics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning
Variant effect prediction
Precision breeding
Crop genomics
In silico mutagenesis