🤖 AI Summary
In precision medicine, accurately distinguishing benign polymorphisms from pathogenic germline variants remains a critical challenge. To address this, we propose a framework that integrates evolutionary information with interpretable AI. First, cross-species targeted pre-training on multi-organism genomic data leverages evolutionary conservation to improve pathogenicity modeling, especially in non-coding regions. Second, task-specific fine-tuning on ClinVar and HGMD couples a DNA foundation model with a large language model (LLM) to jointly classify variants and generate statistically grounded, clinically interpretable explanations. Our method achieves significant performance gains over state-of-the-art tools on ClinVar, with notably improved accuracy for both SNVs and non-SNV variants, including indels and splice-site alterations. The framework provides efficient, reliable computational support for automated genetic testing, clinical variant interpretation, and personalized therapeutic intervention.
📝 Abstract
Distinguishing pathogenic mutations from benign polymorphisms remains a critical challenge in precision medicine. EnTao-GPM, developed by Fudan University and BioMap, addresses this through three innovations: (1) cross-species targeted pre-training on disease-relevant mammalian genomes (human, pig, mouse), which leverages evolutionary conservation to improve the interpretation of pathogenic motifs, particularly in non-coding regions; (2) germline mutation specialization via fine-tuning on ClinVar and HGMD, which improves accuracy for both SNVs and non-SNVs; and (3) an interpretable clinical framework that integrates DNA sequence embeddings with LLM-based statistical explanations to provide actionable insights. Validated against ClinVar, EnTao-GPM demonstrates superior accuracy in mutation classification. It enables faster, more accurate, and more accessible variant interpretation for clinical diagnostics (e.g., variant assessment, risk identification, personalized treatment) and research, advancing personalized medicine.