🤖 AI Summary
Off-target effects in CRISPR gene editing remain challenging to predict accurately, hindering its safe application in medicine and agriculture. Addressing the limitations of existing approaches—which often rely on single-gene models with poor generalizability—this work proposes a data-driven machine learning framework that jointly models guide RNA (gRNA) sequences and off-target effects across multiple genes. By moving beyond the conventional single-gene training paradigm, the method achieves high cross-gene prediction accuracy while preserving strong generalization capability. The model attains a prediction accuracy of 84%, substantially enhancing the reliability and practicality of off-target risk assessment for CRISPR-based systems.
📝 Abstract
With the introduction of cyber-physical genome sequencing and editing technologies such as CRISPR, researchers have easier access to tools for investigating problems and developing remedies across genetics and health science (e.g., agriculture and medicine). As the field grows, new concerns arise about the ability to predict off-target behavior. In this work, we explore the underlying biological and chemical model from a data-driven perspective. Additionally, we present a machine-learning-based solution named \textit{Guide-Guard} that predicts the behavior of the system given a guide RNA (gRNA) in the CRISPR gene-editing process with 84\% accuracy. The solution can be trained on multiple genes simultaneously while retaining accuracy.