Using machine learning to personalize CRISPR-Cas9 applications
- CSIRO, Sydney, NSW, Australia.
- CSIRO, Data61, NSW, Australia.
Numerous studies have sought to build machine learning models that predict general CRISPR-Cas9 activity, and while great progress has been made, these approaches remain limited. Small sequence variations can have a dramatic effect on the CRISPR-Cas9 system, changing on-target activity or increasing the number of off-targets. Despite this risk, current tools do not account for genetic variation within a population. To address this, we developed VARiant-aware detection and SCoring of Off-Targets (VARSCOT), which allows researchers to design personalized CRISPR-Cas9 applications for specific individuals or populations. VARSCOT uses variant information to identify CRISPR-Cas9 target sites unique to a specific individual or population. We find our tool to be the most sensitive detection method for off-targets, finding 40% to 70% more experimentally verified off-targets than other popular software tools. VARSCOT uses a machine learning model to score off-target activity, leading to a 98% reduction in false positives when predicting which off-targets are active. Because off-target activity varies with CRISPR-Cas9 concentration, VARSCOT’s model provides a probabilistic score that accounts for different conditions.
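To illustrate how a sequence variant can create a target site that is absent from the reference genome, the idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not VARSCOT's implementation: the function names (`apply_snvs`, `find_targets`, `variant_specific_targets`) and the toy sequence are hypothetical, only single-nucleotide variants are handled, only the forward strand is scanned, and a 20-nt protospacer followed by an NGG PAM is assumed.

```python
import re


def apply_snvs(reference, snvs):
    """Return `reference` with single-nucleotide variants applied.

    `snvs` maps 0-based positions to alternate bases (hypothetical format).
    """
    seq = list(reference)
    for pos, alt in snvs.items():
        seq[pos] = alt
    return "".join(seq)


def find_targets(sequence, protospacer_len=20):
    """Find forward-strand sites: a protospacer followed by an NGG PAM.

    A lookahead is used so that overlapping sites are all reported.
    Returns a set of (start position, protospacer) pairs.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % protospacer_len)
    return {(m.start(), m.group(1)) for m in pattern.finditer(sequence)}


def variant_specific_targets(reference, snvs):
    """Sites present in the variant-carrying sequence but not in the reference."""
    personalized = apply_snvs(reference, snvs)
    return find_targets(personalized) - find_targets(reference)


# Toy example: the reference has no NGG PAM; an A>G variant at position 22
# turns "TGA" into "TGG" and so creates a new candidate target site.
reference = "ACGTACGTACGTACGTACGT" + "TGA" + "CCCC"
print(variant_specific_targets(reference, {22: "G"}))
```

The same set difference taken in the other direction would list reference sites that a variant destroys. Scoring how active each candidate site is would be a separate modelling step and is not attempted here.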